CallMissed Docs

SPEECH TO TEXT

Transcribe audio to text using Sarvam AI's saaras:v3 model, which supports 22 Indian languages plus English.

Overview

The Speech to Text API transcribes audio files into text. It is powered by Sarvam AI's saaras:v3 model, which supports 22 Indian languages plus English. If no language is specified, the language is detected automatically.

Endpoint: POST /v1/audio/transcriptions

Basic Usage

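from openai import OpenAI

# Point the OpenAI SDK at the CallMissed endpoint
client = OpenAI(
    api_key="cm_your_key",
    base_url="https://api.callmissed.com/v1"
)

# Open the audio in binary mode and upload it for transcription
with open("audio.wav", "rb") as f:
    response = client.audio.transcriptions.create(
        model="saaras:v3",
        file=f
    )

# The transcript is returned in the `text` field
print(response.text)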

Parameters

Parameter	Type	Description
`model`	string	Model ID; use `saaras:v3`
`file`	file	Audio file (WAV, MP3, etc.)
`language`	string	Language code (auto-detected if omitted)
`mode`	string	Output mode (default: `transcribe`); see Output Modes
`response_format`	string	`json`, `text`, or `verbose_json`
`timestamp_granularities[]`	array	`["word"]` for word-level timestamps (OpenAI-compatible)
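
The optional parameters above can be combined in one request. The following is a minimal sketch, not a confirmed reference implementation. It assumes the service accepts these parameters under the same argument names as the OpenAI Python SDK and that, as in the OpenAI API, word-level timestamps are requested together with `verbose_json`. The helper names `build_options` and `transcribe` and the `hi-IN` language code are hypothetical and do not come from this page.

```python
def build_options(language=None, word_timestamps=False):
    """Collect optional keyword arguments for transcriptions.create().

    Hypothetical helper: only the parameter names come from the table above.
    """
    options = {}
    if language is not None:
        # Leave this out to rely on automatic language detection
        options["language"] = language
    if word_timestamps:
        # Assumption: like the OpenAI API, word timestamps need verbose_json
        options["response_format"] = "verbose_json"
        options["timestamp_granularities"] = ["word"]
    return options


def transcribe(path, language=None, word_timestamps=False):
    """Upload one audio file and return the SDK response object."""
    from openai import OpenAI  # imported here so build_options works without the SDK

    client = OpenAI(
        api_key="cm_your_key",
        base_url="https://api.callmissed.com/v1",
    )
    with open(path, "rb") as f:
        return client.audio.transcriptions.create(
            model="saaras:v3",
            file=f,
            **build_options(language=language, word_timestamps=word_timestamps),
        )


# Example call (requires a valid key and an audio file):
# result = transcribe("audio.wav", language="hi-IN", word_timestamps=True)
# print(result.text)
```

Keeping the option building separate from the network call makes it easy to check which fields will be sent before any audio is uploaded.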

Output Modes

Mode	Description
`transcribe`	Standard transcription (default)
`translate`	Transcribe and translate to English
`verbatim`	Exact transcription including filler words
`translit`	Transliteration to Latin script
`codemix`	Code-mixed output (Indic + English)
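
Because `mode` is not a standard argument of the OpenAI Python SDK, one way to send it is through the SDK's `extra_body` option, which adds extra fields to the request. The sketch below assumes the service reads `mode` as a plain form field alongside the file; this page does not confirm that detail. The names `VALID_MODES`, `mode_extra_body`, and `transcribe_with_mode` are hypothetical.

```python
# The five modes listed in the Output Modes table
VALID_MODES = {"transcribe", "translate", "verbatim", "translit", "codemix"}


def mode_extra_body(mode="transcribe"):
    """Return the extra request field for a documented output mode."""
    if mode not in VALID_MODES:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(VALID_MODES)}"
        )
    return {"mode": mode}


def transcribe_with_mode(path, mode="transcribe"):
    """Upload one audio file using the given output mode."""
    from openai import OpenAI  # imported here so the validation works without the SDK

    client = OpenAI(
        api_key="cm_your_key",
        base_url="https://api.callmissed.com/v1",
    )
    with open(path, "rb") as f:
        return client.audio.transcriptions.create(
            model="saaras:v3",
            file=f,
            # Assumption: the service accepts `mode` as an additional form field
            extra_body=mode_extra_body(mode),
        )


# Example call (requires a valid key and an audio file):
# result = transcribe_with_mode("audio.wav", mode="translate")
# print(result.text)
```

Validating the mode locally catches typos such as `transliterate` before a request is made.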
