TEXT TO SPEECH

Convert text to natural-sounding speech using Sarvam AI or ElevenLabs.

Overview

The Text to Speech API converts text into audio. Choose between Sarvam AI (Indic languages, 39 voices) and ElevenLabs (multilingual, premium voices).

Endpoint: POST /v1/audio/speech

Basic Usage

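from openai import OpenAI

# Point the OpenAI SDK at this API's base URL.
client = OpenAI(
    api_key="cm_your_key",
    base_url="https://api.callmissed.com/v1"
)

# Synthesize speech with the Sarvam model and its default voice.
response = client.audio.speech.create(
    model="bulbul:v3",
    voice="shubh",
    input="Namaste, kaise hain aap?"
)

# Save the returned audio to disk.
response.stream_to_file("speech.mp3")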
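
If the OpenAI SDK is not available, the same call can be sketched as a plain HTTP request to POST /v1/audio/speech using only the Python standard library. This is a minimal sketch, not a confirmed description of the API: it assumes OpenAI-style Bearer authentication and that the response body is the raw audio bytes. The helper names build_speech_request and synthesize_to_file are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"  # same base URL as the SDK example


def build_speech_request(api_key: str, text: str,
                         model: str = "bulbul:v3",
                         voice: str = "shubh") -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/audio/speech."""
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the API accepts OpenAI-style Bearer tokens.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key: str, text: str, path: str) -> None:
    """Send the request and write the response body to `path`.

    Assumption: the response body is the encoded audio itself.
    """
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
```

Calling synthesize_to_file("cm_your_key", "Namaste, kaise hain aap?", "speech.mp3") would perform the network request; build_speech_request on its own is useful for inspecting the URL, headers, and JSON body before sending anything.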

Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | bulbul:v3 (Sarvam) or eleven_multilingual_v2 / eleven_flash_v2.5 (ElevenLabs) |
| input | string | Text to synthesize |
| voice | string | Voice ID; defaults to shubh for Sarvam (39 voices available) |
| language | string | Language code (e.g. hi-IN, ta-IN) |
| speed | number | Speech speed, 0.5–2.0 (default 1.0); maps to Sarvam's pace |
| speech_sample_rate | integer | 8000, 16000, 22050, 24000, or 48000 Hz |
| response_format | string | Output format; see Audio Formats below |
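
The optional parameters can be combined in a single request. The sketch below checks speed and speech_sample_rate against the ranges listed in the table, then builds keyword arguments for client.audio.speech.create. It rests on an assumption: language and speech_sample_rate are not named arguments of the OpenAI SDK method, so they are passed through the SDK's extra_body option, which presumes the API reads them from the JSON request body. The helper name speech_kwargs is hypothetical.

```python
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 48000}


def speech_kwargs(text, *, model="bulbul:v3", voice="shubh",
                  language=None, speed=1.0, speech_sample_rate=None):
    """Validate documented ranges and return kwargs for speech.create."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError(f"speed must be within 0.5-2.0, got {speed}")
    if (speech_sample_rate is not None
            and speech_sample_rate not in ALLOWED_SAMPLE_RATES):
        raise ValueError(f"unsupported sample rate: {speech_sample_rate}")

    kwargs = {"model": model, "voice": voice, "input": text, "speed": speed}

    # Assumption: non-SDK fields are read from the JSON body, so they
    # are forwarded through the OpenAI SDK's extra_body option.
    extra = {}
    if language is not None:
        extra["language"] = language
    if speech_sample_rate is not None:
        extra["speech_sample_rate"] = speech_sample_rate
    if extra:
        kwargs["extra_body"] = extra
    return kwargs


# Usage with the client from Basic Usage (performs a network call):
# response = client.audio.speech.create(
#     **speech_kwargs("Vanakkam", language="ta-IN",
#                     speed=1.2, speech_sample_rate=22050)
# )
```

Validating on the client side is a design choice rather than a requirement; it turns an out-of-range value into an immediate ValueError instead of a round trip to the server.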

Audio Formats

Supported values for response_format: mp3, opus, aac, flac, wav, pcm
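
When the format is chosen at runtime, it helps to keep the output filename in step with response_format. The following is a small sketch with a hypothetical helper, output_filename, that rejects values outside the list above and uses the format name as the file extension. That extension choice is a convention assumed here, not something the API requires.

```python
SUPPORTED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def output_filename(stem: str, response_format: str) -> str:
    """Return '<stem>.<format>' after checking the format is supported."""
    fmt = response_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(
            f"response_format must be one of {SUPPORTED_FORMATS}, got {response_format!r}"
        )
    return f"{stem}.{fmt}"


# Usage with the client from Basic Usage (performs a network call):
# fmt = "wav"
# response = client.audio.speech.create(
#     model="bulbul:v3",
#     voice="shubh",
#     input="Namaste, kaise hain aap?",
#     response_format=fmt,
# )
# response.write_to_file(output_filename("speech", fmt))
```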