Voice AI platform Deepgram has unveiled Flux, which it describes as the world’s first conversational speech recognition (CSR) model.

“Flux redefines what speech recognition can do for real-time AI,” said Scott Stephenson, CEO and co-founder, Deepgram. “For decades, ASR was built to listen and record. Flux is different — it listens, understands, and guides conversations with human-like timing. It’s the foundation voice agents have been waiting for and is our latest milestone towards solving the Audio Turing Test.”
Flux builds turn-taking into the recognition model itself, so conversational timing is managed within the model. It offers context-aware turn detection and native handling of barge-in, where a speaker interrupts the agent mid-response.
Deepgram says Flux detects the end of a turn in roughly 260ms. The model also supports eager response generation, which lets an agent start preparing a reply before the turn is complete.
According to Deepgram, Flux provides turn-complete transcripts and structured conversational cues, which let teams deploy production-ready agents in weeks rather than months.
Flux is also designed to scale, with enterprise-ready features. Deepgram says it achieves Nova-3 level accuracy, supports GPU-efficient concurrency with over 100 streams per GPU, and offers predictable costs.

“At Vapi, our mission has always been to give engineering teams a platform to build their conversational front door,” said Jordan Dearsley, founder and CEO of Vapi. “Deepgram’s launch of Flux is a perfect example of that vision coming to life. By embedding turn-taking directly into recognition, Flux solves one of the hardest challenges in conversational AI.”