NER-Indian-xlm-roberta

Maintained By
Venkatesh4342

Property           Value
Parameter Count    277M
License            MIT
F1 Score           0.813
Framework          PyTorch

What is NER-Indian-xlm-roberta?

NER-Indian-xlm-roberta is a specialized Named Entity Recognition model fine-tuned from xlm-roberta-base and optimized for Indian contexts. The model uses a SentencePiece tokenizer and achieves an F1 score of 0.813 on evaluation tasks.

Implementation Details

The model was trained using PyTorch with Native AMP mixed-precision training. It employs the Adam optimizer with tuned hyperparameters (betas=(0.9, 0.999), epsilon=1e-08) and a linear learning rate scheduler.

  • Training batch size: 96 (32 x 3 gradient accumulation steps)
  • Learning rate: 5e-05
  • Training epochs: 4
  • Final validation loss: 0.1404
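The reported hyperparameters can be collected into a single configuration for reference. The values below come from the card itself; the dict layout is only an illustrative sketch, not the original training script.

```python
# Training configuration as reported on the model card.
# The dict structure is illustrative; the values match the card.
training_config = {
    "per_device_train_batch_size": 32,
    "gradient_accumulation_steps": 3,
    "learning_rate": 5e-05,
    "num_train_epochs": 4,
    "optimizer": "Adam",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "fp16": True,  # Native AMP mixed precision
}

# Effective batch size = per-device batch size x gradient accumulation steps
effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # 96, matching the card's "96 (32 x 3)"
```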

Core Capabilities

  • Specialized NER processing for Indian context
  • Proper handling of sentences with appropriate delimiters
  • Support for proper capitalization
  • Token classification with high accuracy
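For token classification, a transformers token-classification pipeline returns per-subword predictions, which usually need to be merged back into whole entities. The sketch below assumes the standard pipeline output schema (entity/word/score fields, BIO labels, SentencePiece "▁" word-start markers); the model id and the sample predictions are illustrative, not taken from the card.

```python
# Hypothetical usage (requires transformers; shown as a comment only):
#   from transformers import pipeline
#   ner = pipeline("token-classification",
#                  model="Venkatesh4342/NER-Indian-xlm-roberta")
#   predictions = ner("Narendra Modi visited Mumbai.")

def merge_subwords(predictions):
    """Merge consecutive subword predictions (BIO scheme) into entity spans."""
    entities = []
    current = None
    for p in predictions:
        label = p["entity"]
        if label.startswith("B-") or current is None or label[2:] != current["label"]:
            if current:
                entities.append(current)
            current = {"label": label[2:],
                       "text": p["word"].lstrip("▁"),
                       "score": p["score"]}
        else:
            # SentencePiece marks word starts with "▁"; join pieces accordingly
            piece = p["word"]
            current["text"] += (" " + piece.lstrip("▁")) if piece.startswith("▁") else piece
            current["score"] = min(current["score"], p["score"])
    if current:
        entities.append(current)
    return entities

# Synthetic predictions mimicking the pipeline's output schema (assumption)
sample = [
    {"entity": "B-PER", "word": "▁Naren", "score": 0.99},
    {"entity": "I-PER", "word": "dra", "score": 0.98},
    {"entity": "I-PER", "word": "▁Modi", "score": 0.99},
    {"entity": "B-LOC", "word": "▁Mumbai", "score": 0.97},
]
print(merge_subwords(sample))
```

Reporting the minimum subword score per entity is a conservative choice; averaging the scores is an equally common alternative.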

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specific optimization for Indian context NER tasks, using a sentence-piece tokenizer and achieving high F1 scores. It's built on the robust xlm-roberta-base architecture while being specifically tuned for Indian named entity recognition.

Q: What are the recommended use cases?

The model is best suited for named entity recognition tasks in Indian contexts, particularly when working with properly formatted text that includes appropriate delimiters and capitalization. It's ideal for applications requiring accurate entity recognition in Indian languages and contexts.
