Neural Audio Codecs: Audio Compression Using LLM

Mikhail T. (Sh0ny)
June 20, 2026
Updated July 4, 2026

In short

The French company Kyutai has released Moshi, an open-source end-to-end speech model for real-time conversation, built on the Mimi neural audio codec. Let's take a closer look at how these codecs work.

In July 2024, the French company Kyutai unveiled the Moshi model, billed as the world's first open-source end-to-end voice AI capable of real-time conversation. The key technology behind it is the Mimi neural audio codec.

How does it work?

Instead of predicting audio samples directly, a model built on a neural audio codec works in three stages:

  • Audio tokenization: the codec converts the audio signal into a sequence of discrete tokens.
  • Next-token prediction with an LLM: the neural network learns to predict which tokens will follow.
  • Reconstruction: the codec converts the tokens back into sound.

This approach compresses audio heavily with little perceptible loss of quality (the reconstruction is close to the original, not bit-identical), opening up new possibilities for voice interfaces and real-time communication.
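
The three stages can be sketched in a few lines of Python. This is a toy illustration only, not how Mimi or Moshi is implemented: the tokenizer here is plain uniform quantization and the "LLM" is a bigram frequency table, whereas a real neural codec learns its encoder and decoder as neural networks. All names (`tokenize`, `fit_bigram`, `predict_next`, `detokenize`, `NUM_LEVELS`) are hypothetical and chosen for this example.

```python
import math
from collections import Counter, defaultdict

NUM_LEVELS = 16  # hypothetical vocabulary size for this toy tokenizer


def tokenize(samples, levels=NUM_LEVELS):
    """Stage 1: map each sample in [-1.0, 1.0] to an integer token."""
    tokens = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clamp out-of-range samples
        tokens.append(min(levels - 1, int((s + 1.0) / 2.0 * levels)))
    return tokens


def fit_bigram(tokens):
    """Stage 2 (training): count which token follows which."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Stage 2 (inference): return the most frequent successor of prev.

    Falls back to repeating prev if it was never seen in training.
    """
    followers = counts.get(prev)
    if not followers:
        return prev
    return followers.most_common(1)[0][0]


def detokenize(tokens, levels=NUM_LEVELS):
    """Stage 3: map each token back to the centre of its quantization bin."""
    return [(t + 0.5) / levels * 2.0 - 1.0 for t in tokens]


# A short sine wave stands in for real audio.
samples = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]

tokens = tokenize(samples)                      # stage 1
model = fit_bigram(tokens)                      # stage 2: "train"
next_token = predict_next(model, tokens[-1])    # stage 2: predict
restored = detokenize(tokens)                   # stage 3

max_error = max(abs(a - b) for a, b in zip(samples, restored))
print(f"{len(samples)} samples -> {len(tokens)} tokens, max error {max_error:.4f}")
```

Even in this toy version the reconstruction is not exact: every sample is rounded to the centre of its bin, so the round trip loses up to half a bin width. Real codecs make the same kind of trade, but with learned representations that keep the loss hard to hear.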

Source: Best Posts of the Week

neural networks, audio codecs, LLM, artificial intelligence, open source