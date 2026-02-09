In a post on X, Sarvam AI's co‑founder Pratyush Kumar said, "Sarvam Vision achieves state-of-the-art accuracy of 84.3 per cent on the olmOCR-Bench (English only subset) outperforming frontier models like Gemini 3 Pro and recent OCR models like DeepSeek OCR 2."

On OmniDocBench v1.5 (English only subset), Sarvam Vision achieved an overall score of 93.28 per cent, excelling in complex formulas and layout parsing and coming within touching distance of the current state of the art, Kumar added.

Kumar also said the company’s Bulbul V3 text‑to‑speech model supports 35 voices across all 22 scheduled Indian languages and can handle scans of varying quality and content.

"On Indian languages, Sarvam Vision is the best model by far, while supporting all 22 scheduled Indian languages," he claimed.