API Guides & Tutorials

SPEECH TO TEXT

Transcribe audio to text using Sarvam AI's saaras model, which supports 22 Indic languages.

Overview

The Speech to Text API transcribes audio files into text. It is powered by Sarvam AI's saaras:v3 model, which supports 22 Indian languages plus English. If no language is specified, the language is detected automatically.

Endpoint: POST /v1/audio/transcriptions

Basic Usage

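from openai import OpenAI

# Point the OpenAI SDK at the CallMissed endpoint instead of the default host.
client = OpenAI(
    api_key="cm_your_key",
    base_url="https://api.callmissed.com/v1"
)

# Open the audio in binary mode; the SDK uploads it as multipart form data.
with open("audio.wav", "rb") as f:
    response = client.audio.transcriptions.create(
        model="saaras:v3",
        file=f
    )

# The transcript is returned in the response's text field.
print(response.text)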

Parameters

Parameter                 | Type   | Description
--------------------------|--------|------------------------------------------------------
model                     | string | saaras:v3
file                      | file   | Audio file (WAV, MP3, etc.)
language                  | string | Language code (auto-detected if omitted)
mode                      | string | Output mode (see Output Modes below)
response_format           | string | json, text, or verbose_json
timestamp_granularities[] | array  | ["word"] for word-level timestamps (OpenAI-compatible)
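
As an illustration of how the optional parameters combine, the sketch below collects them into keyword arguments for the same `client.audio.transcriptions.create` call used in Basic Usage. The helper names `build_transcription_kwargs` and `transcribe` are hypothetical and not part of the API. Whether word-level timestamps require `verbose_json` here, as they do in OpenAI's own API, is not stated on this page, so that pairing is left to the caller.

```python
from typing import Any, Dict, Optional

# Response formats listed in the Parameters table.
ALLOWED_FORMATS = ("json", "text", "verbose_json")


def build_transcription_kwargs(
    file: Any,
    language: Optional[str] = None,
    response_format: str = "json",
    word_timestamps: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for transcriptions.create()."""
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"response_format must be one of {ALLOWED_FORMATS}")
    kwargs: Dict[str, Any] = {
        "model": "saaras:v3",
        "file": file,
        "response_format": response_format,
    }
    # Leaving language out lets the service auto-detect it.
    if language is not None:
        kwargs["language"] = language
    # ["word"] requests word-level timestamps, per the Parameters table.
    if word_timestamps:
        kwargs["timestamp_granularities"] = ["word"]
    return kwargs


def transcribe(path: str, **options: Any) -> Any:
    """Send one file using the client setup shown in Basic Usage."""
    # Imported lazily so the helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key="cm_your_key", base_url="https://api.callmissed.com/v1")
    with open(path, "rb") as f:
        return client.audio.transcriptions.create(
            **build_transcription_kwargs(f, **options)
        )
```

A call might look like `transcribe("audio.wav", language="hi", response_format="verbose_json", word_timestamps=True)`. The language code `"hi"` is only an illustrative value; the accepted code format is not specified on this page.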

Output Modes

Mode       | Description
-----------|--------------------------------------------
transcribe | Standard transcription (default)
translate  | Transcribe and translate to English
verbatim   | Exact transcription including filler words
translit   | Transliteration to Latin script
codemix    | Code-mixed output (Indic + English)
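
The OpenAI Python SDK has no `mode` argument on `transcriptions.create`, so one plausible way to send it is through the SDK's generic `extra_body` option, which adds extra fields to the request. The sketch below assumes the service reads `mode` as an ordinary request field, which this page does not confirm. The helper names `mode_request_options` and `transcribe_with_mode` are hypothetical.

```python
from typing import Any, Dict

# Output modes listed in the table above.
VALID_MODES = ("transcribe", "translate", "verbatim", "translit", "codemix")


def mode_request_options(mode: str = "transcribe") -> Dict[str, Any]:
    """Hypothetical helper: wrap an output mode for the SDK's extra_body option."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    # Assumption: the service accepts `mode` as a plain request field.
    return {"extra_body": {"mode": mode}}


def transcribe_with_mode(path: str, mode: str = "transcribe") -> Any:
    """Transcribe one file with the given output mode."""
    # Imported lazily so the helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key="cm_your_key", base_url="https://api.callmissed.com/v1")
    with open(path, "rb") as f:
        return client.audio.transcriptions.create(
            model="saaras:v3",
            file=f,
            **mode_request_options(mode),
        )
```

For example, `transcribe_with_mode("audio.wav", mode="translate")` would request an English translation of the audio, assuming the field is passed through as described.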