TEXT TO SPEECH
Convert text to natural-sounding speech using Sarvam AI or ElevenLabs.
Overview
The Text to Speech API converts text into audio. Choose between Sarvam AI (Indic languages, 39 voices) or ElevenLabs (multilingual, premium voices).
Endpoint: POST /v1/audio/speech
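Because the endpoint is a plain HTTP POST, a request can also be assembled without an SDK. The sketch below builds (but does not send) such a request with Python's standard library. It assumes the API accepts a JSON body and a Bearer token in the Authorization header, as OpenAI-compatible APIs typically do; this page confirms neither detail, and `build_speech_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"  # base URL used in the Basic Usage example


def build_speech_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for /audio/speech without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer-token auth, as in OpenAI-compatible APIs.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_speech_request(
    "cm_your_key",
    {"model": "bulbul:v3", "voice": "shubh", "input": "Namaste, kaise hain aap?"},
)
# To send it, pass `request` to urllib.request.urlopen(); if the assumptions
# above hold, the response body should be the audio data.
```

Building the request separately from sending it makes the URL, headers, and body easy to inspect before any network call is made.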
Basic Usage
from openai import OpenAI

# Point the OpenAI SDK at the CallMissed endpoint
client = OpenAI(
    api_key="cm_your_key",
    base_url="https://api.callmissed.com/v1"
)

# Synthesize speech with Sarvam's bulbul:v3 model and the default voice
response = client.audio.speech.create(
    model="bulbul:v3",
    voice="shubh",
    input="Namaste, kaise hain aap?"
)
response.stream_to_file("speech.mp3")
Parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | bulbul:v3 (Sarvam) or eleven_multilingual_v2 / eleven_flash_v2.5 (ElevenLabs) |
| input | string | Text to synthesize |
| voice | string | Voice ID; defaults to shubh for Sarvam (39 voices available) |
| language | string | Language code (e.g. hi-IN, ta-IN) |
| speed | number | Speech speed, 0.5 to 2.0 (default 1.0); maps to Sarvam's pace |
| speech_sample_rate | integer | Sample rate in Hz: 8000, 16000, 22050, 24000, or 48000 |
| response_format | string | Output format; see Audio Formats below |
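The parameters above can be checked locally before a request is sent. The following is a minimal sketch, not part of the API: `build_speech_params` is a hypothetical helper that enforces the documented speed range and sample rates. It also assumes that `language` and `speech_sample_rate`, which are not named arguments of the OpenAI SDK's `speech.create()`, need to be passed through the SDK's `extra_body` option.

```python
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 48000}


def build_speech_params(text, model="bulbul:v3", voice="shubh",
                        language=None, speed=1.0, speech_sample_rate=None):
    """Validate values against the documented ranges and return request kwargs."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if speech_sample_rate is not None and speech_sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {speech_sample_rate}")

    params = {"model": model, "voice": voice, "input": text, "speed": speed}
    # Assumption: fields the SDK does not know about travel in extra_body.
    extra = {}
    if language is not None:
        extra["language"] = language
    if speech_sample_rate is not None:
        extra["speech_sample_rate"] = speech_sample_rate
    if extra:
        params["extra_body"] = extra
    return params


params = build_speech_params(
    "Namaste, kaise hain aap?", language="hi-IN", speech_sample_rate=24000
)
# With the client from Basic Usage: client.audio.speech.create(**params)
```

Validating on the client side surfaces an out-of-range speed or an unsupported sample rate as a clear `ValueError` rather than as an API error response.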
Audio Formats
Supported values for response_format: mp3, opus, aac, flac, wav, pcm
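When a format other than mp3 is requested, the output file name should match it. The sketch below uses a hypothetical helper, `output_path`, that checks the requested format against the list above and derives a file name from it. Using the format name as the file extension is an assumption made for illustration; pcm in particular is usually raw audio with no conventional extension.

```python
from pathlib import Path

SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def output_path(stem: str, response_format: str) -> Path:
    """Return a file path whose extension matches the requested format."""
    fmt = response_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    # Assumption: the format name doubles as the file extension.
    return Path(f"{stem}.{fmt}")


path = output_path("speech", "wav")
# With the client from Basic Usage (not executed here):
#   response = client.audio.speech.create(
#       model="bulbul:v3", voice="shubh",
#       input="Namaste, kaise hain aap?", response_format="wav",
#   )
#   response.stream_to_file(path)
```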