It Is a Foundational Indic Speech-to-Text Model Trained on 1 Million Hours, Under the IndiaAI Mission

FinTech BizNews Service
Mumbai, December 19, 2025: Gnani.ai today announced the launch of Vachana STT, a foundational, enterprise-grade Indic speech recognition model trained on over 1 million hours of real-world voice data. The Vachana STT model forms a critical layer of Gnani.ai’s upcoming VoiceOS, a unified voice intelligence stack comprising foundational models across speech recognition, synthesis, understanding, and orchestration.

The Vachana STT model delivers significantly lower word error rates across Indian languages on public datasets, setting a new baseline for speech AI built from India, for India, and at global scale. Its robustness in noisy, real-world omnichannel environments keeps it ahead of sovereign speech AI models.
The Vachana STT model is the first release of Gnani.ai’s VoiceOS series, which aims to deliver a complete, sovereign voice infrastructure stack built from first principles, rather than stitched-together APIs.
Trained on over 1 million hours of proprietary multilingual data covering more than 1,056 domains, it offers the lowest off-the-shelf error rate across diverse domains without any additional fine-tuning.
Industry-Leading Accuracy Across Indic Languages
Across extensive benchmarking on publicly available datasets and real-world omnichannel audio, Vachana STT outperforms leading providers as the best-performing Indic STT (speech-to-text) model, delivering 30 to 40 percent lower word error rate (WER) on low-resource languages and 10 to 20 percent lower WER on the top 8 languages used in India.
Evaluations span Hindi, Bengali, Gujarati, Marathi, Punjabi, Tamil, Telugu, Kannada, Malayalam, Odia, Assamese, and additional Indic languages. Organizations can request detailed benchmarking reports and comparative evaluations directly from Gnani.ai.
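For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of reference words. A "percent lower WER" figure is then a relative reduction against a baseline. The Python sketch below illustrates that standard definition only; it is not Gnani.ai's evaluation code, the function names are hypothetical, and the sample numbers are invented for illustration rather than taken from any benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only (not benchmark results):
# one substitution plus one deletion against a 4-word reference.
example_wer = word_error_rate("a b c d", "a x c")
# A hypothetical baseline WER of 0.20 versus a hypothetical 0.13.
example_reduction = relative_wer_reduction(0.20, 0.13)
```

Under this definition, a drop from a hypothetical 20 percent WER to 13 percent would be reported as a 35 percent relative reduction, not a 7 point one.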
Built for Production Scale
Vachana STT is engineered for environments where speech accuracy directly impacts automation rates, compliance, analytics quality, and customer experience. The platform supports real-time and batch transcription, integrates via enterprise-grade APIs, and is already deployed across BFSI, telecom, customer support, and large-scale voice automation systems, collectively processing approximately 10 million calls per day with a p95 latency of 200 ms.
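A p95 latency of 200 ms means that 95 percent of requests complete in 200 ms or less. As a minimal sketch of how such a figure can be checked against measured response times, the Python snippet below uses the nearest-rank percentile method. The function names and the sample data are hypothetical, and the announcement does not say which percentile method or measurement window Gnani.ai uses.

```python
import math
from typing import Iterable


def percentile_nearest_rank(samples: Iterable[float], pct: float) -> float:
    """Return the pct-th percentile of samples using the nearest-rank method."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    # Nearest rank: the smallest value with at least pct percent of
    # the samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_p95_target(latencies_ms: Iterable[float], target_ms: float = 200.0) -> bool:
    """True if the p95 of the measured latencies is within target_ms."""
    return percentile_nearest_rank(latencies_ms, 95) <= target_ms


# Hypothetical measurements: 100 requests taking 1 ms, 2 ms, ..., 100 ms.
measured = list(range(1, 101))
p95 = percentile_nearest_rank(measured, 95)
```

A tail percentile is typically preferred over the mean for service-level targets because a small share of slow requests can hide behind a healthy average.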
Optimized for Telephony and Beyond
The model reliably handles compressed audio from 8 kbps to 64 kbps, variable network quality, and sustained high concurrency while maintaining predictable, low latency, making it suitable for agent assist, speech analytics, compliance monitoring, and voice-driven workflows.
IndiaAI Mission Selection
Vachana STT is released as part of Gnani.ai’s selection under the IndiaAI Mission, where the Government of India has identified a small set of high-potential startups to build sovereign foundational AI models from India. This selection underscores Gnani.ai’s focus on core AI infrastructure, not application-layer experimentation.
Availability and Access
Vachana STT is available immediately via API access for enterprise customers. Early adopters receive 100,000 free minutes of usage. For benchmarking data, technical evaluations, or API access, contact: hello@gnani.ai
Leadership Perspective
“Speech recognition in India is not a localization problem. It is a foundational systems problem,” said Ganesh Gopalan, Co-Founder and CEO of Gnani.ai. “Vachana STT is built as core infrastructure, trained on how India actually speaks, and designed to operate across channels, not just telephony. Being selected under the IndiaAI Mission reinforces our belief that foundational AI models must be built from India, with production reality at the center.”