Sarvam AI LLMs, IndiaAI Mission and the Rise of Indigenous Large Language Models

26 Feb 2026

At the AI Impact Summit 2026, the Bengaluru-based startup Sarvam AI released two Large Language Models (LLMs).

  • The two models have 35 billion and 105 billion parameters, respectively, and are less power- and compute-intensive than comparable models.

About Large Language Models (LLMs)

  • Large language models (LLMs) are advanced AI systems designed to understand and generate human-like text. 
    • They learn from vast amounts of written data to predict what comes next in a sentence or to create coherent responses to questions. 
  • Architecture and Training: LLMs use deep learning with transformer architectures, as in the Generative Pre-trained Transformer (GPT) family, which are designed for processing sequential text data.
    • They feature multiple neural network layers and an attention mechanism for context understanding.
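
The attention mechanism mentioned above can be illustrated with a small sketch of scaled dot-product attention for a single query. The vectors and the names `attention`, `keys`, `values` and `query` are made up for illustration; real models learn these vectors and run many attention heads in parallel.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy 2-dimensional vectors for three tokens (illustrative values only).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.0]

weights, output = attention(query, keys, values)
print(weights, output)
```

Tokens whose keys are more similar to the query receive larger weights, which is how the model decides which earlier words matter most for the current one.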

Training of Large Language Models (LLMs)

  • Training Process: LLMs are trained on massive clusters of Graphics Processing Units (GPUs), which provide the computational power required to process vast amounts of data.
    • The model learns to predict the next word in a sentence based on the context provided by previous words.
    • Tokenization and Embeddings: Words are broken down into tokens, which are then converted into numerical embeddings representing the context.
    • Massive Text Corpora: LLMs are trained on extensive text data, allowing them to learn grammar, semantics, and conceptual relationships.
    • Learning Techniques: They are trained with self-supervised learning, which lets them generalise to new tasks in a zero-shot manner.
      • Zero-shot learning refers to a model’s ability to handle tasks or make predictions about data it has not seen during training.
    • Enhancing Accuracy: Performance is improved through prompt engineering, fine-tuning, and reinforcement learning from human feedback (RLHF) to address biases and inaccuracies.
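
The tokenisation and next-word prediction steps above can be sketched with a toy example. This is a simplification under stated assumptions: the corpus is invented, tokens are split on whitespace (real LLMs use subword tokenisers), and the "model" is a simple count of which word follows which, not a neural network.

```python
from collections import Counter, defaultdict

# Invented toy corpus; a real training corpus has billions of tokens.
corpus = "the model reads text . the model predicts the next word . the model learns"

# Tokenisation: split text into tokens (whitespace split is an assumption).
tokens = corpus.split()

# Map each distinct token to an integer ID, the input form a model consumes.
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
token_ids = [vocab[t] for t in tokens]

# Self-supervised learning: the "label" for each token is simply the next
# token in the text, so no human annotation is needed.
next_counts = defaultdict(Counter)
for current, following in zip(tokens, tokens[1:]):
    next_counts[current][following] += 1

def predict_next(word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = next_counts.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(token_ids)
print(predict_next("the"))
```

An LLM replaces the count table with a neural network over embeddings, but the training signal is the same: predict the next token from the preceding context.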

Challenges in Training LLMs

  • Limited Capital: Capital is scarce, so Indian firms find it hard to fund LLM training aimed at Indian users, especially when there is no immediate business use case.
    • For example, training a 70-billion-parameter LLM can cost around $6 million, a prohibitive amount for early-stage Indian startups without assured near-term returns.
  • High Capital Intensity: Training and operating LLMs requires expensive GPU clusters and massive electricity consumption, running into millions of dollars.
    • For Example: Training GPT-3 is estimated to have cost $4–5 million in compute, while GPT-4 reportedly required tens of millions of dollars and thousands of GPUs running for months.
  • Scarcity of Indian Language Data: Internet data is dominated by English, European, Korean, and Japanese content, leaving Indian languages underrepresented.
    • For Example, English makes up over 50% of web content, while most Indian languages each account for less than 1%, leading to minimal representation in datasets like Common Crawl.
  • Performance Gap in Indian Languages:  Due to limited native datasets, LLMs often perform poorly in Indian languages compared to English.
  • Higher Token Consumption:  Many models translate Indian language inputs into English for better processing and then translate outputs back, increasing token usage and inference costs.
    • For Example: A 10-word English sentence may use around 12–15 tokens, whereas the same sentence in Hindi (Devanagari script) can consume 20–25 tokens due to tokenisation inefficiencies. 
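
One contributing factor to the higher token count can be shown with a short sketch. Many tokenisers operate on UTF-8 bytes, and Devanagari characters take three bytes each while English letters take one, so a tokeniser with few Hindi-specific merges starts from a much longer byte sequence. The two sentences below are illustrative and only roughly equivalent in meaning; the helper `utf8_stats` is a hypothetical name, and the sketch counts bytes, not tokens from any real tokeniser.

```python
# Illustrative sentences (roughly equivalent in meaning).
english = "India is building its own language models"
hindi = "भारत अपने भाषा मॉडल बना रहा है"

def utf8_stats(text):
    """Count words, Unicode code points, and UTF-8 bytes in a string."""
    return {
        "words": len(text.split()),
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
    }

english_stats = utf8_stats(english)
hindi_stats = utf8_stats(hindi)

print("English:", english_stats)
print("Hindi:  ", hindi_stats)
```

The Hindi sentence uses fewer code points than the English one yet needs more bytes, which is one reason byte-based tokenisers trained mostly on English text spend more tokens on Indian-language input.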

Government Support and Institutional Push

  • IndiaAI Mission Subsidy:  The IndiaAI Mission has commissioned over 36,000 GPUs in Indian data centres (e.g., Yotta) to provide affordable compute access to researchers and startups.
  • Direct Support to Sarvam:  The government allocated 4,096 GPUs from its common compute cluster to Sarvam, with subsidies estimated at nearly ₹100 crore.
  • Ministry of Electronics and Information Technology (MeitY): It promotes domestic LLMs to build skilled talent in model training and to strengthen an Indian AI ecosystem grounded in Indian languages and socio-cultural contexts.

About Mixture of Experts (MoE) 

  • Mixture of Experts (MoE) is a way of designing AI models so that only the necessary parts of the model are used for each question, instead of using the whole model every time.
  • For Example: 
    • Imagine a school with many teachers (experts).
    • If a student asks a maths question, only the maths teacher answers, not the history or science teachers.
  • Similarly:
    • In a normal AI model, all parts work for every question, which uses a lot of power and money.
    • In an MoE model, only a few specialised parts are activated, making it faster and cheaper.
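
The school analogy above can be sketched as top-k routing, a common way to implement MoE. This is a minimal illustration and not Sarvam AI's implementation: the experts, the `router` weights, and the function `moe_forward` are hypothetical, and the experts are simple scaling functions standing in for neural sub-networks.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def make_expert(scale):
    """A stand-in expert that just scales its input (assumption)."""
    return lambda x: [scale * v for v in x]

experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical router weights, one row per expert; learned in a real model.
router = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

def moe_forward(x, top_k=2):
    """Run only the top_k highest-scoring experts and mix their outputs."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in router]
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    gates = softmax([logits[i] for i in chosen])
    output = [0.0] * len(x)
    for gate, index in zip(gates, chosen):
        expert_out = experts[index](x)  # experts not chosen are never called
        output = [o + gate * y for o, y in zip(output, expert_out)]
    return chosen, output

chosen, output = moe_forward([2.0, 1.0])
print(chosen, output)
```

Here only two of the four experts run for the input, so compute per input scales with `top_k`, not with the total number of experts.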

Way Forward

  • Expand Indian Language Datasets: Create large, high-quality, annotated corpora in Hindi, Tamil, Bengali, Marathi and other Indian languages through public–private partnerships and initiatives like Bhashini.
  • Focused Sectoral Models: Develop smaller, domain-specific LLMs for governance, education, healthcare, agriculture, and law instead of only competing with frontier global models.
  • Industry–Academia Collaboration: Strengthen partnerships between IITs, IIITs, startups, and MeitY to build skilled AI talent and research depth.
  • Energy-Efficient Architectures: Adopt approaches like Mixture of Experts (MoE) and model compression to reduce training and inference costs.

Indigenous LLM Efforts

  • BharatGen (IIT Bombay-incubated):  Trained a multilingual 17-billion parameter model aimed at sectors like education and healthcare.
  • Gnani.ai:  Launched a smaller text-to-speech model, focusing on speech-based AI applications.
