Commotion Voice AI is a single gateway to eight speech and language APIs. Authenticate once with an X-API-Key header, get billed against your Voice AI credit wallet, and call any of the endpoints below.
Jump to the Quickstart to make your first call in under five minutes.
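As a rough illustration of the single-key authentication described above, the sketch below builds (but does not send) a request that carries the X-API-Key header. The base URL, the `/language-detection` path, the JSON body shape, and the `build_request` helper are all placeholders invented for this example, not documented values; take the real ones from the Quickstart.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the real base URL and your own key.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that authenticates with the X-API-Key header."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # The same header is used for every endpoint behind the gateway.
            "X-API-Key": API_KEY,
            "Content-Type": "application/json",
        },
    )


# The path and body fields here are assumptions made for illustration only.
req = build_request("/language-detection", {"text": "Bonjour tout le monde"})

# urllib stores header names in capitalized form ("X-api-key");
# HTTP header names are case-insensitive, so the server sees the same header.
print(req.get_method(), req.full_url)
```

Once the placeholders are replaced with real values, the request could be sent with `urllib.request.urlopen(req)`; the same helper would apply to any of the endpoints below, since only the path and payload change.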
Speech to Text
Real-time streaming speech-to-text APIs
Text to Speech
Natural voice synthesis with multiple models
Transcriptions
Batch audio transcription with speaker diarization
Language Detection
Detect the language of a text input
Sentiment Analysis
Detect sentiment polarity and intensity from a user message
Intent Detection
Classify a user message against a set of caller-supplied intents
Post Call Analytics
Extract a summary, intent, outcome, and sentiment journey from a call transcript
Answering Machine Detection
Provision a streaming session and classify call pickup as human, voicemail, ivr, or unknown