nepaliBERT

Maintained by: Shushant

| Property | Value |
|---|---|
| Model Type | Masked Language Model |
| Training Data Size | 4.6 GB |
| Training Hardware | Tesla V100 GPU |
| Training Loss | 1.0495 |
| Research Paper | View Paper |

What is nepaliBERT?

nepaliBERT is a masked language model designed specifically for the Nepali language, built on the BERT base architecture. It represents a significant advancement in Nepali natural language processing, trained on approximately 4.6 GB of text collected from various Nepali news sources and the OSCAR Nepali corpus. The training dataset comprises 85,467 news articles, making it a comprehensive resource for Nepali language understanding.

Implementation Details

The model was trained using Hugging Face's Trainer API on a Tesla V100 GPU (640 Tensor Cores) at Kathmandu University's supercomputing facility. Training took approximately 3 days, 8 hours, and 57 minutes. The model achieved a perplexity of 8.56, establishing it as a state-of-the-art result for Nepali (Devanagari-script) language modeling.

  • Base Architecture: BERT base uncased
  • Training Corpus: 4.6 GB of Nepali text
  • Evaluation Set: 12 MB of textual data
  • Performance Metrics: training loss 1.0495, perplexity 8.56
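As a sanity check on the reported metrics: perplexity is the exponential of the average per-token cross-entropy loss (in nats). Assuming the 8.56 figure was measured on the 12 MB evaluation set and the 1.0495 figure is the final training loss, the two are related as sketched below:

```python
import math

def perplexity(cross_entropy_nats: float) -> float:
    """Perplexity is exp of the average per-token cross-entropy (in nats)."""
    return math.exp(cross_entropy_nats)

def cross_entropy(perplexity_value: float) -> float:
    """Inverse: recover the average per-token loss from a perplexity score."""
    return math.log(perplexity_value)

# The reported 8.56 perplexity implies an average loss of ~2.147 nats per token.
print(round(cross_entropy(8.56), 3))   # → 2.147
# Conversely, the reported 1.0495 training loss corresponds to a perplexity of ~2.856.
print(round(perplexity(1.0495), 3))    # → 2.856
```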

Core Capabilities

  • Masked Language Modeling for Nepali text
  • Superior performance in sentiment analysis of Nepali tweets
  • Handling of Devanagari script
  • General-purpose NLP tasks for Nepali language
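The masked-language-modeling capability can be exercised through the Hugging Face `fill-mask` pipeline. The sketch below is a hedged example: the Hub id `Shushant/nepaliBERT` and the Nepali example sentence are assumptions for illustration, not confirmed by this card.

```python
from typing import Dict, List

def top_predictions(results: List[Dict], k: int = 3) -> List[str]:
    """Pick the k highest-scoring token predictions from a fill-mask result."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [r["token_str"] for r in ranked[:k]]

def demo() -> None:
    # Downloads the checkpoint from the Hugging Face Hub.
    # NOTE: the Hub id below is an assumption based on the maintainer name.
    from transformers import pipeline
    fill = pipeline("fill-mask", model="Shushant/nepaliBERT")
    # "Nepal is a [MASK] country." — [MASK] is BERT's standard mask token.
    results = fill("नेपाल एउटा [MASK] देश हो ।")
    print(top_predictions(results))
```

Each pipeline result carries a `token_str` (the predicted fill) and a `score` (its probability), so ranking by `score` surfaces the model's most confident completions.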

Frequently Asked Questions

Q: What makes this model unique?

nepaliBERT is a state-of-the-art model for Nepali (Devanagari-script) language processing, trained on an extensive dataset of Nepali news articles. Its unique value lies in its specialized pretraining for Nepali language understanding and its strong performance across NLP tasks, particularly sentiment analysis.

Q: What are the recommended use cases?

The model is designed for a range of NLP tasks over Nepali (Devanagari-script) text, including but not limited to text classification, sentiment analysis, and masked language modeling. It is particularly effective for applications requiring a deep understanding of Nepali text.
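For the downstream use cases such as sentiment analysis, the usual approach is to load the pretrained encoder with a classification head and fine-tune it on labeled data. The sketch below makes several assumptions not stated in this card: the Hub id `Shushant/nepaliBERT`, a binary label set, and the head configuration.

```python
from typing import Dict, List

def label_from_logits(logits: List[float], id2label: Dict[int, str]) -> str:
    """Map a row of class logits to its label name via argmax."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best]

def build_classifier():
    # NOTE: Hub id and label set are assumptions for illustration; the
    # classification head is randomly initialized and must be fine-tuned.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained("Shushant/nepaliBERT")
    model = AutoModelForSequenceClassification.from_pretrained(
        "Shushant/nepaliBERT",
        num_labels=2,
        id2label={0: "negative", 1: "positive"},
    )
    return tokenizer, model
```

Fine-tuning the combined model on a labeled Nepali tweet corpus (e.g. with the same Trainer API used for pretraining) would then yield the sentiment classifier described above.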
