IndicBERTv2-MLM-only
| Property | Value |
|---|---|
| Parameters | 278M |
| License | MIT |
| Languages Supported | 26 |
| Training Objective | Masked Language Modeling (MLM) |
What is IndicBERTv2-MLM-only?
IndicBERTv2-MLM-only is a multilingual language model designed for Indian languages. Developed by AI4Bharat, it is trained on IndicCorp v2 and supports 26 languages, covering Indian languages and English. The model uses a vanilla BERT architecture with Masked Language Modeling (MLM) as its training objective.
Implementation Details
The model is built on the Transformer architecture and implemented in PyTorch. It has 278M parameters and is trained for fill-mask tasks. It handles multiple scripts, including Devanagari, Bengali, Malayalam, and Tamil, so a single model covers text across the supported Indian languages.
- Trained on the IndicCorp v2 dataset
- Supports 26 languages and their respective scripts
- Uses the standard BERT architecture with an MLM objective
- Available through Hugging Face's model hub
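
As a rough illustration of the fill-mask usage described above, the sketch below masks one word in a sentence and asks the model for candidate replacements through the Hugging Face `transformers` fill-mask pipeline. The Hub identifier `ai4bharat/IndicBERTv2-MLM-only` is an assumption and should be checked against the model page; `mask_word` and `predict_masked` are hypothetical helper names introduced here for illustration.

```python
# Assumed Hugging Face Hub identifier; verify it against the model page.
MODEL_ID = "ai4bharat/IndicBERTv2-MLM-only"


def mask_word(sentence: str, target: str, mask_token: str) -> str:
    """Replace the first occurrence of `target` with `mask_token`."""
    if target not in sentence:
        raise ValueError(f"{target!r} does not occur in the sentence")
    return sentence.replace(target, mask_token, 1)


def predict_masked(sentence: str, target: str, top_k: int = 5,
                   model_id: str = MODEL_ID):
    """Return (token, score) pairs the model proposes for the masked word."""
    # Imported lazily so mask_word() works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    # Use the tokenizer's own mask token instead of hard-coding "[MASK]".
    masked = mask_word(sentence, target, fill.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill(masked, top_k=top_k)]
```

A call such as `predict_masked("मैं घर जा रहा हूँ", "घर")` would download the weights on first use and return the model's top candidates for the masked position. The actual predictions depend on the checkpoint and are not shown here.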
Core Capabilities
- Masked Language Modeling for 26 different languages
- Zero-shot cross-lingual transfer
- Support for multiple Indian scripts and writing systems
- Fine-tuning capabilities for various downstream tasks
- Inference endpoints available for production deployment
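
The zero-shot cross-lingual transfer and fine-tuning capabilities listed above can be sketched as follows: fine-tune a classification head on labeled data in one source language, then evaluate on other languages without any further training. This is a minimal sketch under stated assumptions, not the authors' evaluation protocol. The `lang` field, the helper names `zero_shot_split` and `load_classifier`, and the Hub identifier are all hypothetical choices made for this example.

```python
from collections import defaultdict

# Assumed Hugging Face Hub identifier; verify it against the model page.
MODEL_ID = "ai4bharat/IndicBERTv2-MLM-only"


def zero_shot_split(examples, source_lang="en"):
    """Split examples into a source-language training set and
    per-language evaluation sets for every other language."""
    train = [ex for ex in examples if ex["lang"] == source_lang]
    eval_by_lang = defaultdict(list)
    for ex in examples:
        if ex["lang"] != source_lang:
            eval_by_lang[ex["lang"]].append(ex)
    return train, dict(eval_by_lang)


def load_classifier(num_labels: int, model_id: str = MODEL_ID):
    """Load the tokenizer and the encoder with a fresh classification head."""
    # Imported lazily so zero_shot_split() works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The classification head is randomly initialised and must be fine-tuned.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model
```

In this setup the model would be trained only on the `train` split and then scored separately on each entry of `eval_by_lang`, so any accuracy on the target languages comes from representations shared across languages rather than from target-language labels.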
Frequently Asked Questions
Q: What makes this model unique?
The model's main strength is its coverage of Indian languages and scripts. With 278M parameters and support for 26 languages, it is one of the larger multilingual models designed specifically for Indian languages and can serve as a base model for a range of Indian language processing tasks.
Q: What are the recommended use cases?
The model is well suited to masked word prediction (filling in missing words) and language understanding tasks across Indian languages. It can be fine-tuned for downstream tasks including NER, paraphrase detection, question answering, and sentiment analysis.
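
For token-level tasks such as NER, a common preprocessing step when fine-tuning BERT-style models is to align word-level labels with subword tokens. The sketch below shows one widely used convention: the first subword of each word keeps the label, and the remaining subwords and special tokens receive `-100`, the default `ignore_index` of PyTorch's cross-entropy loss. It assumes the checkpoint ships a fast tokenizer that exposes `word_ids()`; the helper names and the Hub identifier are hypothetical.

```python
# Assumed Hugging Face Hub identifier; verify it against the model page.
MODEL_ID = "ai4bharat/IndicBERTv2-MLM-only"


def align_labels(word_ids, word_labels, ignore_index=-100):
    """Map word-level labels onto subword tokens.

    `word_ids` holds, for each token, the index of the word it came from,
    or None for special tokens.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            aligned.append(ignore_index)      # special token
        elif wid != previous:
            aligned.append(word_labels[wid])  # first subword of a word
        else:
            aligned.append(ignore_index)      # continuation subword
        previous = wid
    return aligned


def encode_for_ner(words, word_labels, model_id: str = MODEL_ID):
    """Tokenize pre-split words and attach aligned labels."""
    # Imported lazily so align_labels() works without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    encoding = tokenizer(words, is_split_into_words=True, truncation=True)
    # word_ids() is only available on fast tokenizers (an assumption here).
    encoding["labels"] = align_labels(encoding.word_ids(), word_labels)
    return encoding
```

The encoded examples can then be passed to a token-classification head. Labeling only the first subword keeps the loss per word rather than per subword, which matters for scripts where words split into many pieces.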