nepaliBERT
| Property | Value |
|---|---|
| Training Data Size | 4.6 GB |
| Model Type | Masked Language Model |
| Training Hardware | Tesla V100 GPU |
| Research Paper | View Paper |
| Training Loss | 1.0495 |
What is nepaliBERT?
nepaliBERT is a masked language model for the Nepali language, built on the BERT base architecture. It was trained on approximately 4.6 GB of text collected from various Nepali news sources and the OSCAR Nepali corpus; the training dataset comprises 85,467 news articles, making it a comprehensive resource for Nepali language understanding.
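Because nepaliBERT is a masked language model, the most direct way to query it is through a fill-mask interface. The sketch below uses Hugging Face's `fill-mask` pipeline; the hub id `Shushant/nepaliBERT`, the example sentence, and the `make_masked_sentence` helper are illustrative assumptions, not confirmed details of this model card.

```python
# Sketch: masked-token prediction with nepaliBERT through the Hugging Face
# fill-mask pipeline. The hub id "Shushant/nepaliBERT" is an assumption;
# substitute the model's actual repository name.

def make_masked_sentence(template: str, mask_token: str = "[MASK]") -> str:
    """Fill the {mask} slot of a template with the model's mask token."""
    return template.format(mask=mask_token)

if __name__ == "__main__":
    from transformers import pipeline  # pip install transformers

    fill_mask = pipeline("fill-mask", model="Shushant/nepaliBERT")
    # "Nepal is a beautiful [MASK]." in Nepali
    sentence = make_masked_sentence("नेपाल एउटा सुन्दर {mask} हो ।")
    for prediction in fill_mask(sentence)[:3]:
        print(prediction["token_str"], round(prediction["score"], 3))
```

`[MASK]` is the default mask token for BERT-style tokenizers; if the model uses a different one, read it from `tokenizer.mask_token` instead of hard-coding it.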
Implementation Details
The model was trained with Hugging Face's Trainer API on a Tesla V100 GPU (640 Tensor Cores) at Kathmandu University's supercomputer facility. Training took approximately 3 days, 8 hours, and 57 minutes. The model achieved a perplexity of 8.56 on the evaluation set, reported as state-of-the-art for Devanagari-script language modeling.
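A Trainer-based MLM run of the kind described above can be sketched as follows. The file paths, tokenizer directory, hyperparameters, and from-scratch configuration are illustrative assumptions, not nepaliBERT's actual settings.

```python
# Minimal sketch of a masked-language-model training run with Hugging Face's
# Trainer API. All paths and hyperparameters here are assumed for
# illustration, not taken from the nepaliBERT training run.

MLM_PROBABILITY = 0.15  # BERT's standard masking rate (assumed here)

if __name__ == "__main__":
    from datasets import load_dataset
    from transformers import (
        BertConfig, BertForMaskedLM, BertTokenizerFast,
        DataCollatorForLanguageModeling, Trainer, TrainingArguments,
    )

    # Hypothetical locations of a Nepali tokenizer and raw-text corpus.
    tokenizer = BertTokenizerFast.from_pretrained("tokenizer_dir")
    model = BertForMaskedLM(BertConfig(vocab_size=tokenizer.vocab_size))

    raw = load_dataset("text", data_files={"train": "nepali_corpus.txt"})
    tokenized = raw.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True,
        remove_columns=["text"],
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="nepaliBERT", num_train_epochs=1),
        train_dataset=tokenized["train"],
        data_collator=DataCollatorForLanguageModeling(
            tokenizer=tokenizer, mlm=True, mlm_probability=MLM_PROBABILITY
        ),
    )
    trainer.train()
```

The `DataCollatorForLanguageModeling` handles the random masking on the fly, so the corpus itself needs no preprocessing beyond tokenization.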
- Base Architecture: BERT base uncased
- Training Corpus: 4.6 GB of Nepali text
- Evaluation Set: 12 MB of textual data
- Performance Metrics: 1.0495 loss, 8.56 perplexity
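The two metrics above can be related directly: for a language model, perplexity is the exponential of the cross-entropy loss. Note that exp(1.0495) ≈ 2.86, not 8.56, so the reported perplexity presumably comes from a separate evaluation loss of about 2.15; that reconciliation is an assumption, sketched below.

```python
import math

# Perplexity = exp(cross-entropy loss). The 1.0495 figure is the final
# training loss; the 8.56 perplexity implies a distinct evaluation loss.
train_loss = 1.0495
reported_perplexity = 8.56

implied_eval_loss = math.log(reported_perplexity)   # ≈ 2.147
train_perplexity = math.exp(train_loss)             # ≈ 2.856

print(round(implied_eval_loss, 3), round(train_perplexity, 3))
```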
Core Capabilities
- Masked Language Modeling for Nepali text
- Superior performance in sentiment analysis of Nepali tweets
- Handling of Devanagari script
- General-purpose NLP tasks for Nepali language
Frequently Asked Questions
Q: What makes this model unique?
nepaliBERT is a state-of-the-art model for Devanagari-script language processing, trained on an extensive corpus of Nepali news articles. Its value lies in its specialized pretraining for Nepali language understanding and its strong downstream performance, particularly in sentiment analysis.
Q: What are the recommended use cases?
The model is designed for NLP tasks over Nepali text in Devanagari script, including but not limited to text classification, sentiment analysis, and masked language modeling. It is particularly effective for applications requiring deep understanding of Nepali text.
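For downstream tasks such as sentiment classification, the pretrained encoder can be loaded with a fresh classification head. The sketch below assumes the hub id `Shushant/nepaliBERT` and a hypothetical three-way label set; both are illustrations, not details from this card.

```python
# Sketch: adapting nepaliBERT to sentiment analysis with a sequence
# classification head. The hub id "Shushant/nepaliBERT" and the label set
# below are assumptions for illustration.

LABELS = ["negative", "neutral", "positive"]  # hypothetical label set

def label_to_id(label: str) -> int:
    """Map a sentiment label to its integer class id."""
    return LABELS.index(label)

if __name__ == "__main__":
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("Shushant/nepaliBERT")
    model = AutoModelForSequenceClassification.from_pretrained(
        "Shushant/nepaliBERT", num_labels=len(LABELS)
    )
    # From here, fine-tune with the Trainer API on a labelled dataset of
    # Nepali tweets or other Nepali text.
```

The classification head is randomly initialized, so fine-tuning on labelled data is required before the model produces meaningful sentiment predictions.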