Voice cloning
Voice cloning takes a short audio sample and returns a voice_id you can pass to /ai/text-to-speech for any future synthesis.
-
Capture a clean sample.
Record 10 seconds of a single speaker in a quiet environment. Use one of WAV, MP3, M4A, or FLAC at 16 kHz or higher. See Audio inputs for full guidance.
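A quick local check can catch a sample that is too short or recorded at too low a rate before you upload it. The sketch below uses only Python's standard `wave` module, so it covers uncompressed WAV files only. The helper names `describe_wav` and `check_sample` are hypothetical, and treating 10 seconds as a lower bound is an assumption rather than a documented limit.

```python
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # "16 kHz or higher", from the guidance above
TARGET_DURATION_S = 10.0     # assumption: treat the suggested 10 seconds as a minimum


def describe_wav(path: str) -> dict:
    """Read basic properties of an uncompressed WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "sample_rate_hz": sample_rate,
            "duration_s": wav.getnframes() / sample_rate,
            "channels": wav.getnchannels(),
        }


def check_sample(path: str) -> list[str]:
    """Return a list of problems found; an empty list means no problem was detected."""
    info = describe_wav(path)
    problems = []
    if info["sample_rate_hz"] < MIN_SAMPLE_RATE_HZ:
        problems.append(
            f"sample rate {info['sample_rate_hz']} Hz is below {MIN_SAMPLE_RATE_HZ} Hz"
        )
    if info["duration_s"] < TARGET_DURATION_S:
        problems.append(
            f"duration {info['duration_s']:.1f} s is shorter than {TARGET_DURATION_S:.0f} s"
        )
    return problems
```

Calling `check_sample("sample.wav")` returns a list of messages you can print before uploading. It cannot tell whether the recording is quiet or contains a single speaker; those still need a listen.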
-
Send it to /ai/voice-clone.

curl:

    curl -X POST https://api.gocommotion.com/ai/voice-clone \
      -H "X-API-Key: eak_live_your_key_here" \
      -F "file=@sample.wav" \
      -F "name=narrator-en"

Python:

    import requests

    with open("sample.wav", "rb") as audio:
        response = requests.post(
            "https://api.gocommotion.com/ai/voice-clone",
            headers={"X-API-Key": "eak_live_your_key_here"},
            files={"file": audio},
            data={"name": "narrator-en"},
        )
    voice_id = response.json()["voice_id"]

The response includes a stable voice_id like vc_01HXYZ....
-
Use the voice_id in synthesis.

    curl -X POST https://api.gocommotion.com/ai/text-to-speech \
      -H "X-API-Key: eak_live_your_key_here" \
      -H "Content-Type: application/json" \
      -d '{
        "text": "Hello from your cloned voice.",
        "voice_id": "vc_01HXYZ..."
      }' \
      --output greeting.mp3
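The synthesis step is shown only as a curl command, so here is a rough Python equivalent. It is a minimal sketch that uses the standard library's `urllib.request` so it runs without extra dependencies; `requests` would work just as well. The helper names `build_tts_request` and `synthesize_to_file` are hypothetical, and the code assumes the response body is the audio itself, as the curl `--output` flag suggests.

```python
import json
import urllib.request

API_BASE = "https://api.gocommotion.com"


def build_tts_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/ai/text-to-speech",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, voice_id: str, api_key: str, out_path: str) -> None:
    """Send the request and write the response body to out_path.

    Assumption: the response body is raw audio. urlopen raises
    urllib.error.HTTPError on 4xx/5xx responses, so a failed call
    does not leave a partial file behind.
    """
    request = build_tts_request(text, voice_id, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as out:
        out.write(audio)
```

A call might look like `synthesize_to_file("Hello from your cloned voice.", voice_id, "eak_live_your_key_here", "greeting.mp3")`, where `voice_id` is the value returned by the /ai/voice-clone request.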