Deepgram launched a native integration with Amazon SageMaker AI that delivers streaming, real-time speech-to-text (STT), text-to-speech (TTS), and the Voice Agent API as real-time endpoints, without requiring custom pipelines or orchestration.
It aims to enable teams to build, deploy, and scale voice-powered applications within their existing AWS workflows while retaining the security and compliance guarantees of their AWS environment.

“Deepgram’s integration with Amazon SageMaker represents an important step forward for real-time voice AI. By bringing our streaming speech models directly into SageMaker, enterprises can deploy speech-to-text, text-to-speech, and voice agent capabilities with sub-second latency, all within their AWS environment. This collaboration extends SageMaker’s functionality and gives developers a powerful way to build and scale voice-driven applications securely and efficiently,” said Scott Stephenson, CEO and co-founder, Deepgram.
Native streaming via Amazon SageMaker
The integration supports real-time inference directly through the SageMaker API, with sub-second latency and enterprise-grade reliability. It is designed for high-scale use cases such as contact centres, trading floors, and live analytics.
The solution runs on AWS, supports streaming responses via the InvokeEndpointWithResponseStream API, and keeps data within the customer's AWS environment.
Customers can deploy Deepgram within their Amazon Virtual Private Cloud (Amazon VPC) or consume it as a managed service, giving them flexible deployment options and helping them meet data residency requirements.
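For developers, streaming inference through this integration maps to the SageMaker Runtime's InvokeEndpointWithResponseStream call (exposed in boto3 as `invoke_endpoint_with_response_stream`). A minimal sketch follows, assuming a deployed Deepgram STT endpoint that accepts WAV audio and streams back JSON transcript chunks; the endpoint name, content type, and payload shape here are illustrative assumptions, not documented Deepgram values.

```python
# Hypothetical sketch: streaming transcripts from a Deepgram STT endpoint
# hosted on Amazon SageMaker. Endpoint name and payload format are assumptions.

def extract_payload_text(event: dict):
    """Decode the text carried by one PayloadPart event, if present."""
    part = event.get("PayloadPart")
    return part["Bytes"].decode("utf-8") if part else None

def stream_transcripts(endpoint_name: str, audio_bytes: bytes, region: str = "us-east-1"):
    """Yield transcript chunks as the endpoint streams them back."""
    import boto3  # imported lazily so the sketch loads without the AWS SDK installed
    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint_with_response_stream(
        EndpointName=endpoint_name,   # assumption: name of your deployed Deepgram endpoint
        ContentType="audio/wav",      # assumption: raw WAV audio input
        Accept="application/json",
        Body=audio_bytes,
    )
    # The response Body is an EventStream of PayloadPart events,
    # which arrive incrementally as the model produces output.
    for event in response["Body"]:
        text = extract_payload_text(event)
        if text is not None:
            yield text
```

Because the endpoint runs inside the customer's VPC or managed environment, the audio and transcripts in a loop like `for chunk in stream_transcripts(...)` never leave AWS.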

"Deepgram's new Amazon SageMaker AI integration makes it simple for customers to bring real-time voice capabilities into their AWS workflows," said Ankur Mehrotra, general manager for Amazon SageMaker at AWS. Deepgram is an AWS Generative AI Competency Partner.
