NER-Indian-xlm-roberta

Venkatesh4342

A specialized NER model fine-tuned on XLM-RoBERTa for Indian context, achieving 0.813 F1 score. 277M parameters, MIT licensed.

  • Parameter Count: 277M
  • License: MIT
  • F1 Score: 0.813
  • Framework: PyTorch

What is NER-Indian-xlm-roberta?

NER-Indian-xlm-roberta is a specialized Named Entity Recognition model fine-tuned from xlm-roberta-base and optimized for the Indian context. The model uses a SentencePiece tokenizer and achieves an F1 score of 0.813 on its evaluation set.
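Assuming the checkpoint is published on the Hugging Face Hub under the author's namespace (the repo id below is an assumption based on the handle above), a minimal inference sketch using the transformers pipeline API might look like this:

```python
def load_ner(model_id: str = "Venkatesh4342/NER-Indian-xlm-roberta"):
    """Build a token-classification pipeline for the model.

    The repo id default is an assumption based on the author's handle.
    aggregation_strategy="simple" merges SentencePiece sub-tokens back
    into whole-word entity spans.
    """
    # Deferred import so the sketch can be inspected without transformers installed.
    from transformers import pipeline
    return pipeline("ner", model=model_id, aggregation_strategy="simple")

if __name__ == "__main__":
    ner = load_ner()
    for entity in ner("Virat Kohli was born in Delhi."):
        print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```

Each returned dict carries the aggregated entity group, the surface text, and a confidence score.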

Implementation Details

The model was trained with PyTorch using Native AMP mixed-precision training. It employs the Adam optimizer (betas = (0.9, 0.999), epsilon = 1e-08) with a linear learning-rate scheduler.

  • Effective training batch size: 96 (per-device batch size 32 × 3 gradient accumulation steps)
  • Learning rate: 5e-05
  • Training epochs: 4
  • Final validation loss: 0.1404
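The hyperparameters above can be collected into a single configuration; the sketch below expresses them as a plain dict whose keys mirror Hugging Face TrainingArguments names (the exact setup used by the author is an assumption):

```python
# Hyperparameters reported on the model card, expressed as a config dict.
# Key names follow the Hugging Face TrainingArguments convention; how the
# author actually wired the run is an assumption.
training_config = {
    "per_device_train_batch_size": 32,
    "gradient_accumulation_steps": 3,
    "learning_rate": 5e-05,
    "num_train_epochs": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "fp16": True,  # Native AMP mixed precision
}

# The effective batch size is the per-device batch times accumulation steps.
effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # → 96
```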

Core Capabilities

  • Specialized NER for the Indian context
  • Works best on text segmented with standard sentence delimiters
  • Expects conventionally capitalized input
  • Token classification with high accuracy
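Token classification emits one BIO tag per token; downstream code typically groups those tags into entity spans. A minimal, library-free sketch of that grouping (the tag set and example tokens are illustrative, not from the model's own output):

```python
def group_entities(tokens, tags):
    """Group parallel lists of tokens and BIO tags into (type, text) spans.

    B-XXX starts a span, I-XXX continues a span of the same type,
    and "O" (or an inconsistent I- tag) closes any open span.
    """
    spans, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans

tokens = ["Sachin", "Tendulkar", "played", "in", "Mumbai"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(group_entities(tokens, tags))  # → [('PER', 'Sachin Tendulkar'), ('LOC', 'Mumbai')]
```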

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specific optimization for Indian-context NER tasks: it uses a SentencePiece tokenizer and reaches a strong F1 score of 0.813. It builds on the robust xlm-roberta-base architecture while being fine-tuned specifically for Indian named entity recognition.

Q: What are the recommended use cases?

The model is best suited to named entity recognition in Indian contexts, particularly on properly formatted text with standard delimiters and capitalization. It is a good fit for applications that need accurate entity recognition in Indian languages and contexts.
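Because the card stresses properly delimited, conventionally capitalized input, a light preprocessing pass before inference can help. A sketch of such a pass (the normalization rules below are assumptions for illustration, not part of the model):

```python
import re

def normalize_for_ner(text: str) -> str:
    """Light cleanup before feeding text to the NER model.

    Illustrative assumptions, not part of the model itself: collapse
    runs of whitespace, ensure the text ends with a sentence delimiter,
    and capitalize the first character.
    """
    text = re.sub(r"\s+", " ", text).strip()
    if text and text[-1] not in ".!?":
        text += "."
    return text[:1].upper() + text[1:] if text else text

print(normalize_for_ner("  rahul visited   bengaluru "))  # → "Rahul visited bengaluru."
```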
