Overview

Commotion Voice AI is a single gateway to eight speech and language APIs. Authenticate once with an X-API-Key header and call any of the endpoints below; usage is billed against your Voice AI credit wallet.

Jump to the Quickstart to make your first call in under five minutes.
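As a rough sketch of the authentication model described above, the snippet below builds (but does not send) a JSON request that carries the X-API-Key header. Only the header name comes from this page. The base URL, the `/language-detection` path, the `text` payload field, and the `build_request` helper are hypothetical placeholders; take the real values from the Quickstart and the individual API references.

```python
import json
import urllib.request

# Hypothetical base URL, for illustration only; use the one from the Quickstart.
BASE_URL = "https://voice-ai.example.com"


def build_request(path: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request with the X-API-Key header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=body,
        headers={
            "X-API-Key": api_key,  # the same key is reused for every endpoint
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Assumed path and payload shape; neither is confirmed by this overview.
req = build_request(
    "/language-detection",
    "YOUR_API_KEY",
    {"text": "Bonjour tout le monde"},
)

# To actually send it: urllib.request.urlopen(req)
```

Because one key covers every endpoint, a small helper like this can be reused across all eight APIs by changing only the path and payload.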

APIs

Speech to Text

Real-time streaming transcription

Text to Speech

Natural voice synthesis with multiple models

Transcriptions

Batch audio transcription with speaker diarization

Language Detection

Detect the language of a text input

Sentiment Analysis

Detect sentiment polarity and intensity from a user message

Intent Detection

Classify a user message against a set of caller-supplied intents

Post Call Analytics

Extract a summary, intent, outcome, and sentiment journey from a call transcript

Answering Machine Detection

Provision a streaming session and classify call pickup as human, voicemail, ivr, or unknown