Sarvam-105B: India's Open-Source Reasoning Model Guide
Sarvam AI releases 105B and 30B open-source reasoning models under Apache 2.0, trained entirely in India. 128K context window powers the Indus AI assistant.
Key Takeaways
- Parameters (flagship): 105B (mixture-of-experts)
- Context window: 128K tokens
- Indian languages: 22+
- Open-source license: Apache 2.0
The global AI landscape has been dominated by a familiar cast of players: OpenAI and Anthropic in the United States, DeepSeek and Alibaba in China, Mistral in France. India, despite its massive technical talent pool and rapidly growing AI market, has been largely absent from the frontier model conversation. Sarvam AI changed that on March 6, 2026, when it released Sarvam-105B and Sarvam-30B, two open-source reasoning models trained entirely on Indian soil.
These are not incremental improvements or fine-tuned derivatives of existing Western models. Sarvam-105B is a 105-billion parameter reasoning model built from scratch with a 128K context window, competitive benchmark performance against frontier models, and native support for over 22 Indian languages. Released under the Apache 2.0 license, both models represent a deliberate statement about Indian AI sovereignty: the country can develop world-class AI infrastructure independently. For teams tracking the evolving landscape of AI and digital transformation, Sarvam's release signals that the open-source model ecosystem is becoming genuinely global.
What Is Sarvam-105B
Sarvam-105B is a 105-billion parameter open-source reasoning model developed by Sarvam AI, a Bangalore-based artificial intelligence company founded with the explicit mission of building India-first AI infrastructure. The model was trained entirely in India using domestic GPU clusters, making it the first frontier-class reasoning model to emerge from the Indian AI ecosystem.
The model supports a 128K token context window, enabling it to process long documents, codebases, and multi-turn conversations without truncation. It uses a decoder-only transformer architecture with mixture-of-experts (MoE) design, allowing efficient inference by activating only a subset of parameters for each token. This architectural choice means that while the total parameter count is 105 billion, the active parameters during inference are substantially lower, reducing compute costs per query.
Mixture-of-experts design
A mixture-of-experts architecture with 105 billion total parameters. Only a fraction of them activate per token, balancing frontier-class capability with practical inference costs for production deployment.
128K context window
The full 128K token context window handles long documents, extensive codebases, and complex multi-turn reasoning chains without information loss or truncation artifacts.
Apache 2.0 license
Fully open-source under Apache 2.0 with no usage restrictions. Commercial use, fine-tuning, and redistribution are all permitted without royalties or special agreements.
Sarvam AI was founded by Vivek Raghavan and Pratyush Kumar, both with deep roots in Indian AI research. The company raised significant funding specifically to build India-first AI infrastructure, rejecting the common approach of fine-tuning Western models for Indian languages. Their thesis is that genuine multilingual AI requires training from scratch with Indian language data woven into the pretraining corpus, not bolted on afterward.
Architecture and Training Infrastructure
The technical architecture of Sarvam-105B reflects deliberate decisions about how to build a competitive reasoning model within the constraints of Indian GPU infrastructure. The model uses a sparse mixture-of-experts (MoE) transformer architecture, where each token is routed to a subset of expert networks rather than passing through all parameters. This design enables the model to have 105 billion total parameters while keeping per-token compute costs comparable to a dense model roughly one-third the size.
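Sarvam has not published its router details, so as a hedged illustration of the general technique, the following sketch shows the standard top-k gating used by sparse MoE transformers: each token's hidden state is scored against every expert, and only the k highest-scoring expert networks actually run. All shapes and names here are illustrative, not Sarvam's actual configuration.

```python
import numpy as np

def topk_gate(x, w_gate, k=2):
    """Route a token's hidden state to its top-k experts.

    x: (d_model,) token hidden state
    w_gate: (d_model, n_experts) router weight matrix
    Returns (indices of chosen experts, their normalized mixing weights).
    """
    logits = x @ w_gate                        # one router score per expert
    top = np.argsort(logits)[-k:]              # indices of the k highest scores
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                   # softmax over the selected experts only
    return top, weights

rng = np.random.default_rng(0)
d_model, n_experts = 64, 16
x = rng.standard_normal(d_model)
w_gate = rng.standard_normal((d_model, n_experts))

experts, weights = topk_gate(x, w_gate, k=2)
# Only 2 of 16 expert FFNs execute for this token; the other 14 are skipped,
# which is why per-token compute is far below the total parameter count.
```

In a full model, the selected experts' outputs are summed using these mixing weights; the routing decision is what makes 105B total parameters affordable at inference time.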
Training was conducted entirely on GPU clusters located in India, a decision driven by both sovereignty goals and practical considerations around data residency. The training data included a carefully curated mixture of English text, Indian language corpora, code, and reasoning-focused datasets. Sarvam AI invested heavily in data collection and curation for Indian languages, building custom pipelines to process text from government documents, academic papers, news outlets, and web crawls across more than 22 languages.
Domestic GPU clusters
Trained entirely on Indian-hosted GPU infrastructure, maintaining full data sovereignty throughout the training pipeline from data collection to model checkpoint storage.
Sparse MoE architecture
Mixture-of-experts routing activates roughly one-third of parameters per token, enabling frontier-scale capability with practical inference costs on available hardware.
Multilingual pretraining corpus
Custom data pipelines processed text from 22+ Indian languages including government documents, academic papers, news, and web crawls to build a genuinely multilingual training dataset.
Reasoning-focused training stages
Multi-stage training process with dedicated reasoning phases using chain-of-thought data, mathematical proofs, code generation tasks, and logical deduction exercises.
The training process used a multi-stage approach. The initial pretraining phase focused on general language understanding across English and Indian languages. Subsequent stages progressively introduced reasoning-specific data including mathematical problem sets, code generation tasks, logical deduction exercises, and chain-of-thought reasoning traces. This staged approach follows patterns established by models like DeepSeek V3 but adapts them for a multilingual context where reasoning capability needs to transfer across language boundaries.
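The staged approach above amounts to reweighting the sampling probability of each data source as training progresses. Sarvam has not published its actual mixture ratios, so the numbers below are purely illustrative; the sketch only shows the mechanism of sampling from a per-stage mixture.

```python
import random

# Illustrative only: Sarvam AI has not disclosed its real mixture weights.
# Each stage assigns a sampling probability to each data source.
stages = {
    "pretraining": {"english_web": 0.45, "indic_corpora": 0.35,
                    "code": 0.15, "math": 0.05},
    "reasoning":   {"english_web": 0.15, "indic_corpora": 0.20, "code": 0.25,
                    "math": 0.20, "chain_of_thought": 0.20},
}

def sample_source(stage, rng):
    """Pick a data source according to the stage's mixture weights."""
    sources, weights = zip(*stages[stage].items())
    return rng.choices(sources, weights=weights, k=1)[0]

rng = random.Random(0)
counts = {}
for _ in range(10_000):
    s = sample_source("reasoning", rng)
    counts[s] = counts.get(s, 0) + 1
# Empirical frequencies track the stage weights (e.g. roughly 25% "code"
# batches during the reasoning stage in this toy configuration).
```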
A key innovation in Sarvam's training pipeline is cross-lingual reasoning alignment. The model was explicitly trained to maintain reasoning quality when switching between languages mid-conversation or when the reasoning task is presented in one language and the response is expected in another. This is a harder problem than monolingual reasoning and required specific data curation strategies that Western labs have generally not prioritized.
Sarvam-30B: The Efficient Alternative
Alongside the flagship 105B model, Sarvam AI released Sarvam-30B, a 30-billion parameter dense model that retains approximately 85% of the larger model's reasoning capability. While Sarvam-105B demonstrates what Indian AI infrastructure can produce at frontier scale, Sarvam-30B addresses the practical question of deployment cost and accessibility.
Sarvam-105B
- 105B total parameters (MoE architecture)
- 128K token context window
- Best-in-class multilingual reasoning
- Requires multi-GPU deployment (4-8x A100/H100)
- Ideal for research and maximum performance

Sarvam-30B
- 30B parameters (dense architecture)
- 128K token context window
- ~85% of flagship reasoning capability
- Runs on a single A100 or 2x A10G GPUs
- Ideal for production and cost-sensitive deployments
The 30B model uses a dense transformer architecture rather than the MoE design of the 105B variant. This means all 30 billion parameters are active during inference, but the smaller total size makes it deployable on a single A100 GPU or a pair of A10G GPUs. For Indian startups and enterprises that cannot justify the cost of multi-GPU clusters, Sarvam-30B makes frontier-adjacent multilingual reasoning accessible at a fraction of the infrastructure cost.
The efficiency gains are substantial. At the 30B parameter scale, inference latency drops significantly compared to the 105B model, making real-time applications such as chatbots, code assistants, and customer support systems practical. The model also supports quantization down to 4-bit precision with minimal quality degradation, further reducing deployment requirements. For teams comparing efficient model options, the 30B variant sits in a similar deployment class to Qwen 3.5 but with dramatically better Indian language performance.
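A back-of-envelope calculation shows why 4-bit quantization changes the deployment picture for a 30B dense model. The overhead multiplier below is an assumption standing in for activations, KV cache, and runtime buffers, not a vendor-measured figure.

```python
def vram_gb(params_b, bits, overhead=1.2):
    """Rough weight-memory estimate for a dense model.

    params_b: parameter count in billions
    bits: precision of the stored weights (16 for fp16, 4 for int4)
    overhead: assumed multiplier for activations, KV cache, and buffers
    """
    bytes_weights = params_b * 1e9 * bits / 8
    return bytes_weights * overhead / 1e9

fp16 = vram_gb(30, 16)   # ~72 GB: tight even on a single 80 GB A100
int4 = vram_gb(30, 4)    # ~18 GB: fits comfortably on a 24-40 GB card
```

These estimates are consistent with the article's deployment guidance: fp16 inference for the 30B model needs a top-end single GPU, while 4-bit quantization brings it within reach of far cheaper hardware.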
Reasoning Benchmarks and Performance
Sarvam-105B was evaluated against standard AI benchmarks and shows competitive performance with established frontier models. The benchmark results position it as a credible alternative for organizations that need strong reasoning combined with Indian language support, though it does not uniformly surpass the leading Western and Chinese models on every English-centric metric.
The most significant benchmark results are in the multilingual categories. On IndicBench and similar Indian language evaluation suites, Sarvam-105B substantially outperforms models like GPT-4, Claude, and Gemini that were primarily trained on English-centric data. This is not surprising given the training data composition, but it validates the thesis that building for Indian languages from the ground up produces fundamentally better results than fine-tuning English-first models.
On English-centric benchmarks like MMLU, GSM8K, and HumanEval, Sarvam-105B performs competitively with models in the 70-100B parameter class. It does not claim to beat GPT-4o or Claude Sonnet on pure English reasoning, and Sarvam AI has been transparent about this positioning. The value proposition is not “better than Western models at everything” but rather “competitive on English tasks while being dramatically better for Indian language workloads.”
Key insight: Sarvam-105B's most compelling advantage is cross-lingual reasoning, the ability to accept a reasoning problem in Hindi and produce a step-by-step solution in Tamil, or to analyze a Telugu document and generate insights in English. This capability gap between Sarvam and Western models is far larger than any gap on English-only benchmarks.
Multilingual Indian Language Support
India has 22 officially recognized languages and hundreds of additional dialects spoken by over 1.4 billion people. Most frontier AI models treat Indian languages as an afterthought, adding support through translation layers or limited fine-tuning that produces stilted, unnatural output. Sarvam-105B takes a fundamentally different approach by incorporating Indian language data directly into pretraining.
Indo-Aryan languages
Hindi, Bengali, Marathi, Gujarati, Punjabi, Odia, Assamese, Urdu, Sindhi, and Nepali. These cover the largest speaker populations across Northern and Central India with native-quality generation and comprehension.
Dravidian languages
Tamil, Telugu, Kannada, and Malayalam. These four South Indian languages have distinct scripts and grammar structures. Sarvam-105B handles their agglutinative morphology natively rather than through transliteration.
Code-switching
Real-world Indian communication frequently mixes languages (Hinglish, Tanglish). Sarvam-105B handles code-switching naturally because the training data includes authentic mixed-language text from web and social media sources.
The practical difference between Sarvam's native multilingual approach and the fine-tuning approach used by Western models is stark. When asked to explain a complex technical concept in Tamil, a fine-tuned English model typically translates English reasoning into Tamil, producing grammatically correct but unnatural output. Sarvam-105B reasons directly in Tamil, producing explanations that read as if written by a fluent Tamil speaker. This distinction matters enormously for production applications serving Indian users.
Code-switching support deserves special attention. In everyday Indian communication, people routinely mix Hindi and English (Hinglish), Tamil and English (Tanglish), and other language combinations. Western models generally struggle with this because their training data does not include sufficient code-switched text. Sarvam-105B handles these mixed-language inputs fluently because the pretraining corpus was specifically curated to include authentic code-switched text from social media, messaging platforms, and web forums. The ability to process real-world Indian communication patterns, not just formal single-language text, is what makes these models practically useful for Indian market applications.
Indus AI Assistant and Deployment
Sarvam AI does not just release model weights and walk away. The company also ships Indus, a consumer-facing AI assistant powered by the Sarvam model family. Indus serves as both a product and a live demonstration of what these models can do in production, handling conversational AI tasks in Indian languages including information retrieval, content generation, code assistance, summarization, and general reasoning.
- Indus assistant: consumer-facing conversational AI powered by Sarvam-105B and Sarvam-30B. Handles tasks in 22+ Indian languages with native fluency, including real-time code-switching between Hindi and English.
- Availability: web application and mobile app for direct consumer access to Sarvam's reasoning capabilities.
- Self-hosting: Apache 2.0 licensing permits full self-hosting on your own infrastructure. Model weights are available on Hugging Face for download, fine-tuning, and custom deployment.
- Inference: compatible with vLLM, TGI, and other standard inference frameworks for production deployment.
For developers and enterprises looking to deploy Sarvam models on their own infrastructure, both models are available on Hugging Face with full model weights and tokenizer files. They are compatible with standard inference frameworks including vLLM, Text Generation Inference (TGI), and llama.cpp for quantized deployments. Sarvam AI also provides an API for teams that prefer managed inference without operating their own GPU infrastructure.
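As a hedged sketch of what self-hosted serving might look like, the command below uses vLLM's standard OpenAI-compatible server with its documented flags for tensor parallelism and context length. The Hugging Face repo id is a placeholder assumption; check Sarvam AI's actual organization page for the real name.

```shell
# Hypothetical repo id -- substitute the actual Sarvam AI Hugging Face path.
# --tensor-parallel-size shards the model across 2 GPUs;
# --max-model-len caps the context at 128K tokens (131072).
vllm serve sarvamai/sarvam-30b \
  --tensor-parallel-size 2 \
  --max-model-len 131072
```

Once the server is up, any OpenAI-compatible client can point at it, which is what makes the Apache 2.0 self-hosting path a drop-in replacement for managed APIs.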
The deployment flexibility is a direct consequence of the Apache 2.0 license. Unlike restricted-license models that prohibit commercial use above certain user thresholds or require revenue sharing, Sarvam's models can be deployed without any legal constraints. This is particularly important for Indian government agencies, financial institutions, and healthcare organizations that need on-premises deployment for regulatory compliance. The open-source model landscape continues to expand with similar permissive licensing, as seen with models like GLM-5 from Zhipu AI, which follows the same Apache 2.0 approach for the Chinese market.
Deployment recommendation: For most production use cases, start with Sarvam-30B. It runs on a single A100 or two A10G GPUs, supports 4-bit quantization with minimal quality loss, and provides roughly 85% of the 105B model's capability. Reserve Sarvam-105B for tasks where maximum reasoning performance justifies the multi-GPU infrastructure cost.
Indian AI Sovereignty Implications
Sarvam-105B is significant beyond its technical merits. It represents the first time a frontier-class reasoning model has been developed entirely within India, using Indian infrastructure, Indian talent, and data pipelines designed for Indian languages. This matters for several interconnected reasons that extend well beyond benchmark scores.
Data sovereignty
Training on domestic infrastructure means Indian government, healthcare, and financial data never needs to leave the country for AI processing. This addresses regulatory concerns that have limited AI adoption in sensitive Indian sectors.
Strategic independence
India's AI ecosystem currently depends on US and Chinese model providers for frontier capabilities. Domestic model development reduces geopolitical risk and ensures continued access regardless of export controls or trade policies.
Talent development
Building frontier models domestically creates experience and expertise that stays in India. The engineers, researchers, and MLOps teams at Sarvam AI are building institutional knowledge that benefits the broader Indian AI ecosystem.
Language accessibility
Over 800 million Indians do not speak English as their primary language. AI models that work natively in Indian languages remove the English barrier that currently limits AI accessibility for the majority of the Indian population.
The Indian government has been increasingly vocal about AI sovereignty, with the Ministry of Electronics and Information Technology actively supporting domestic AI development through funding programs and compute infrastructure investments. Sarvam-105B aligns directly with these national priorities. It provides a concrete example of what Indian-built AI infrastructure can achieve, strengthening the case for continued investment in domestic capabilities.
The open-source release strategy amplifies the sovereignty impact. By releasing under Apache 2.0, Sarvam AI ensures that the models cannot be restricted or revoked. Any Indian organization can download, deploy, and build on these models permanently, regardless of what happens to Sarvam AI as a company or how geopolitical relationships evolve. This is infrastructure that, once released, cannot be taken back. The pattern mirrors the broader shift toward open-source frontier models globally, where organizations like DeepSeek and Mistral have demonstrated that open weights can coexist with commercial viability.
The competitive dynamics are also worth noting. India is the world's largest democracy and its fifth-largest economy. Having domestic frontier AI capability positions India as a third pole in global AI development alongside the US and China. For international businesses operating in India, Sarvam's models offer a compliance-friendly option that keeps data within Indian jurisdiction while providing reasoning capability comparable to foreign alternatives. This is not just a technical achievement but a strategic one that reshapes how the global AI map looks.
Conclusion
Sarvam-105B and Sarvam-30B mark a genuine milestone in global AI development. For the first time, a frontier-class reasoning model family has been trained entirely in India, released under a permissive open-source license, and designed from the ground up for the multilingual reality of the Indian subcontinent. The technical achievement is real: competitive English reasoning performance, best-in-class Indian language support, a 128K context window, and efficient deployment options through the 30B variant.
The broader significance is equally important. Sarvam's release demonstrates that frontier AI development is no longer the exclusive domain of US and Chinese labs. Indian infrastructure, talent, and data pipelines can produce world-class models that serve the specific needs of 1.4 billion people in their native languages. For organizations operating in the Indian market, these models offer a combination of reasoning capability, language support, and deployment flexibility that no Western or Chinese model currently matches.
Whether you deploy Sarvam models directly or simply track their progress as the Indian AI ecosystem matures, the message is clear: the era of a few dominant players controlling frontier AI is ending. Open-source models from diverse geographies are creating a more distributed, accessible, and culturally relevant AI landscape. Sarvam-105B is not just an Indian achievement; it is evidence that this distributed future is already arriving.