IndicNER

Maintained By
ai4bharat

Property             Value
License              MIT
Paper                View Paper
Downloads            478,117
Languages Supported  11 Indian Languages

What is IndicNER?

IndicNER is a Named Entity Recognition (NER) model built for Indian languages. It is based on the bert-base-multilingual-uncased architecture and fine-tuned on the Naamapadam dataset, which is derived from the Samanantar corpus. The model identifies named entities in 11 Indian languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu.

Implementation Details

The model uses a transformer architecture and is implemented in PyTorch. It was trained on millions of sentences from the Samanantar corpus and has been benchmarked on human-annotated test sets and on several public Indian NER datasets.

  • Base Architecture: BERT Multilingual Uncased
  • Training Dataset: Naamapadam (derived from Samanantar)
  • Framework: PyTorch
  • Evaluation: Human-annotated test sets
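
To make the evaluation step concrete, here is a minimal sketch of how NER predictions are commonly scored against a human-annotated test set. It assumes exact-match, span-level precision, recall, and F1; the source does not state which metric was actually used, and the function name span_prf, the span layout, and the sample data are all hypothetical.

```python
from typing import Set, Tuple

# Assumed span layout: (start_token, end_token_exclusive, entity_type).
Span = Tuple[int, int, str]


def span_prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over entity spans.

    A predicted span counts as correct only if its boundaries and its
    entity type both match a gold span.
    """
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up annotations for one sentence (illustrative only).
gold = {(0, 2, "PER"), (4, 5, "LOC"), (7, 9, "ORG")}
pred = {(0, 2, "PER"), (4, 5, "ORG"), (7, 9, "ORG")}  # one wrong type

precision, recall, f1 = span_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this example the span (4, 5) has the right boundaries but the wrong type, so it counts against both precision and recall.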

Core Capabilities

  • Multi-language NER processing across 11 Indian languages
  • Token classification for named entity identification
  • Efficient processing of large-scale text data
  • Integration capability with existing NLP pipelines
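
As an illustration of what token classification involves downstream, the sketch below groups per-token tags into entity spans. It assumes a BIO tagging scheme with PER and LOC labels, which is common for NER models but is not confirmed by the text above; the function group_bio and the sample tags are hypothetical, and in practice the tags would come from the model rather than being hard-coded.

```python
from typing import List, Optional, Tuple


def group_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_text, entity_type) pairs.

    Assumption: tags look like "B-PER", "I-PER", or "O". An "I-" tag that
    does not continue an open entity of the same type is treated as "O".
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity being built, if any.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            entities.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            flush()
    flush()
    return entities


# Hard-coded Hindi example; the tags stand in for model output.
tokens = ["सचिन", "तेंदुलकर", "मुंबई", "में", "रहते", "हैं"]
tags = ["B-PER", "I-PER", "B-LOC", "O", "O", "O"]

entities = group_bio(tokens, tags)
print(entities)
```

Returning plain (text, type) pairs keeps the result easy to pass to later pipeline stages such as indexing or filtering.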

Frequently Asked Questions

Q: What makes this model unique?

IndicNER covers 11 Indian languages in a single model and was trained on data derived from the large Samanantar corpus, which makes it well suited to NER in these languages. Because it is a BERT-based token-classification model, it can also be integrated into existing multilingual NLP pipelines.

Q: What are the recommended use cases?

The model is suited to applications that need named entity recognition in Indian languages, such as information extraction, text analysis, and content classification. It is particularly useful for organizations that work with multilingual Indian content.
