Sarvam AI Launches New Language Model for Indian Languages

Sarvam AI, a rising player in India’s generative AI sector, has introduced a new language model named Sarvam-1. The open-source model is designed specifically for Indian languages, supporting ten of them, including Bengali, Hindi, and Tamil, as well as English. Launched in October 2024, Sarvam-1 follows the company’s earlier model, Sarvam 2B, which debuted in August 2024.

Overview of Sarvam-1

Sarvam-1 has 2 billion parameters. Parameter count is a rough indicator of an AI model’s complexity and capability; for comparison, Microsoft’s Phi-3 Mini has 3.8 billion parameters. Sarvam-1 is classified as a small language model (SLM) because it has fewer than ten billion parameters. This contrasts with large language models (LLMs) like OpenAI’s GPT-4, which is widely reported to have over a trillion parameters.

Technical Specifications

Sarvam-1 was trained on 1,024 Graphics Processing Units (GPUs) from Yotta using NVIDIA’s NeMo framework. The model addresses a major challenge: the scarcity of high-quality training data for Indian languages, as existing datasets often lack the necessary depth and diversity. To overcome this, Sarvam AI built its own training corpus, Sarvam-2T.

Training Data

Sarvam-2T consists of an estimated 2 trillion tokens. The dataset is roughly evenly distributed across the supported languages, with the notable exception of Hindi, which makes up approximately 20% of the corpus (about 400 billion tokens); English text and programming-language code also account for considerable portions. Sarvam AI additionally used synthetic data generation to raise the quality of the training data. This diversity helps the model perform both monolingual and multilingual tasks.

Performance Metrics

Sarvam-1 is reported to handle Indic-language scripts more efficiently than earlier LLMs: its tokenizer uses fewer tokens per word, so the model needs less computation to process the same text. On several Indic-language benchmarks, including MMLU and ARC-Challenge, the model has surpassed larger AI models such as Meta’s Llama-3 and Google’s Gemma-2.
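
As a rough illustration of the tokens-per-word idea, the sketch below compares how many tokens different tokenizers produce for the same Hindi sentence. This is a minimal sketch, assuming the Hugging Face transformers library; the repo IDs, the sample sentence, and the simple whitespace word split are illustrative assumptions rather than details from the article, and some repositories (such as Gemma’s) require accepting a licence on Hugging Face before download.

# A minimal sketch, assuming the Hugging Face `transformers` library.
# Repo IDs and the sample sentence are illustrative assumptions.
from transformers import AutoTokenizer

def tokens_per_word(model_id: str, text: str) -> float:
    """Average number of tokens the tokenizer emits per whitespace-separated word."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return len(tokenizer.tokenize(text)) / len(text.split())

sample = "भारत एक विशाल और विविधतापूर्ण देश है"  # illustrative Hindi sentence

for model_id in ["sarvamai/sarvam-1", "google/gemma-2-9b"]:  # assumed repo IDs
    print(model_id, round(tokens_per_word(model_id, sample), 2))

A lower value means fewer tokens per word, which translates directly into fewer decoding steps per sentence at inference time.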

Benchmark Achievements

On the TriviaQA benchmark for Indic languages, Sarvam-1 achieved an accuracy of 86.11, compared with 61.47 for Meta’s Llama-3.1 8B. Sarvam-1 also offers notable computational efficiency, with inference speeds reported to be 4-6 times faster than larger models such as Gemma-2-9B and Llama-3.1-8B.

Practical Applications

The combination of strong performance and high inference efficiency makes Sarvam-1 suitable for practical applications, including deployment on edge devices. This is particularly important for real-world use cases where computational resources may be limited.
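
For constrained hardware, one common approach is to load the model with 4-bit quantization to shrink its memory footprint. This is a generic deployment technique, not a method described in the article. The minimal sketch below assumes the transformers, accelerate, and bitsandbytes packages, a CUDA-capable GPU, and the repo ID sarvamai/sarvam-1, which should be verified on Hugging Face.

# A minimal sketch, assuming `transformers`, `accelerate`, `bitsandbytes`,
# and a CUDA GPU. Quantization here is a generic deployment technique,
# not Sarvam AI's stated method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "sarvamai/sarvam-1"  # assumed Hugging Face repo ID

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit precision
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bfloat16 for stability
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",                      # place layers on available devices
)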

Sarvam-1 is available for download on Hugging Face, an online platform for open-source AI models. This accessibility allows developers and researchers to utilise the model for various applications involving Indian languages.
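
A minimal sketch of what downloading and running the model might look like with the transformers library is shown below. The repo ID and the prompt are assumptions for illustration; check the model card on Hugging Face for the exact ID and recommended usage.

# A minimal sketch, assuming `transformers`, `accelerate`, and PyTorch.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sarvamai/sarvam-1"  # assumed repo ID; verify on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "भारतीय भाषाओं के लिए कृत्रिम बुद्धिमत्ता"  # "AI for Indian languages"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

As with most pretrained base models (as opposed to chat models), generation continues the prompt rather than answering it conversationally.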

