CallMissed Docs

TEXT TO SPEECH

Convert text to natural-sounding speech using Sarvam AI or ElevenLabs.

Overview

The Text to Speech API converts text into audio. Choose between two providers: Sarvam AI (Indic languages, 39 voices) and ElevenLabs (multilingual, premium voices).

Endpoint: POST /v1/audio/speech

Basic Usage

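from openai import OpenAI

# Point the OpenAI SDK at the CallMissed API.
client = OpenAI(
    api_key="cm_your_key",  # replace with your CallMissed API key
    base_url="https://api.callmissed.com/v1"
)

# Synthesize speech with Sarvam's bulbul:v3 model and the default voice.
response = client.audio.speech.create(
    model="bulbul:v3",
    voice="shubh",
    input="Namaste, kaise hain aap?"
)

# Save the returned audio to disk.
response.stream_to_file("speech.mp3")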
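
The same request can also be made without the SDK, directly against the endpoint listed in the Overview. The following is a minimal sketch using only the Python standard library. It assumes the endpoint accepts the same Bearer authorization header and JSON body that the OpenAI SDK sends, and that it returns raw audio bytes. `build_speech_request` and `fetch_speech` are hypothetical helper names, not part of any CallMissed library.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"


def build_speech_request(api_key, text, model="bulbul:v3", voice="shubh"):
    """Build (but do not send) a POST request to /audio/speech."""
    body = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: same Bearer scheme the OpenAI SDK uses.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_speech(api_key, text, path="speech.mp3"):
    """Send the request and write the response body to disk.

    Assumption: the response body is the raw audio data.
    """
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Calling `fetch_speech("cm_your_key", "Namaste, kaise hain aap?")` would then write `speech.mp3`, provided the assumptions above hold for your account.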

Parameters

Parameter	Type	Description
`model`	string	`bulbul:v3` (Sarvam) or `eleven_multilingual_v2` / `eleven_flash_v2.5` (ElevenLabs)
`input`	string	Text to synthesize
`voice`	string	Voice ID — default `shubh` for Sarvam (39 voices available)
`language`	string	Language code (e.g. `hi-IN`, `ta-IN`)
`speed`	number	Speech speed 0.5–2.0 (default 1.0) — maps to Sarvam's `pace`
`speech_sample_rate`	integer	8000, 16000, 22050, 24000, or 48000 Hz
`response_format`	string	Output format — see below
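
The parameters above can be combined in a single call. The sketch below validates `speed` and `speech_sample_rate` against the ranges in the table before sending anything. The OpenAI SDK has no named arguments for `language` or `speech_sample_rate`, so this example passes them through the SDK's `extra_body` option. Whether CallMissed reads them from the JSON body this way is an assumption, and `build_speech_kwargs` and `synthesize` are hypothetical helper names.

```python
# Values copied from the parameter table.
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 48000}


def build_speech_kwargs(text, voice="shubh", language=None,
                        speed=1.0, sample_rate=None):
    """Hypothetical helper: validate inputs and build kwargs for speech.create()."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if sample_rate is not None and sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")

    # Assumption: fields the OpenAI SDK does not define are sent via
    # extra_body, which the SDK merges into the JSON request body.
    extra = {}
    if language is not None:
        extra["language"] = language
    if sample_rate is not None:
        extra["speech_sample_rate"] = sample_rate

    kwargs = {"model": "bulbul:v3", "voice": voice, "input": text, "speed": speed}
    if extra:
        kwargs["extra_body"] = extra
    return kwargs


def synthesize(text, path="speech.mp3", **options):
    """Call the endpoint and save the audio; requires a real API key."""
    # Imported here so build_speech_kwargs works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key="cm_your_key",
                    base_url="https://api.callmissed.com/v1")
    response = client.audio.speech.create(**build_speech_kwargs(text, **options))
    response.stream_to_file(path)
```

For example, `synthesize("Vanakkam", language="ta-IN", speed=1.25, sample_rate=16000)` would request Tamil speech at a slightly faster pace and a 16 kHz sample rate.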

Audio Formats

Supported values for `response_format`: `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`
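
As an illustration of selecting an output format, the sketch below checks the requested format against the list above and names the output file after it. `response_format` is a standard argument of the OpenAI SDK's `speech.create` method. Using the format name as the file extension is a convention assumed here, and `normalize_format` and `save_speech` are hypothetical helper names.

```python
from pathlib import Path

# Values copied from the list of supported formats.
SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def normalize_format(response_format):
    """Hypothetical helper: lowercase the format and reject unsupported values."""
    fmt = response_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    return fmt


def save_speech(client, text, stem="speech", response_format="wav"):
    """Request audio in the given format and save it as <stem>.<format>.

    `client` is an OpenAI client configured as in Basic Usage.
    """
    fmt = normalize_format(response_format)
    # Assumption: the format name doubles as the file extension.
    path = Path(f"{stem}.{fmt}")
    response = client.audio.speech.create(
        model="bulbul:v3",
        voice="shubh",
        input=text,
        response_format=fmt,
    )
    response.stream_to_file(path)
    return path
```

Validating locally keeps a typo such as `"ogg"` from costing a round trip to the API.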
