ner-bert-base-cased-pt-lenerbr

pierreguillou

A Portuguese legal NER model based on BERT. It achieves an 89.3% F1 score and is specialized for legal entity recognition, with strong performance on person, organization, and temporal entities.

Task Type: Token Classification (NER)
Language: Portuguese
F1 Score: 89.26%
Accuracy: 97.59%
Downloads: 108,927

What is ner-bert-base-cased-pt-lenerbr?

This is a Named Entity Recognition (NER) model specialized for the Portuguese legal domain. Built on the BERT architecture, it was fine-tuned on the LeNER-Br dataset to identify and classify legal entities in Portuguese texts. Its reported scores are highest for person names (98.3% F1) and temporal expressions (96.6% F1), followed by organizational entities (89.3% F1).

Implementation Details

The model was developed through a two-stage training process: first, the language model was specialized on legal texts; then it was fine-tuned for NER. It uses a BERT base architecture with tokenization specialized for Portuguese legal terminology. The reported hyperparameters include a learning rate of 2e-5 and 2 gradient accumulation steps.

  • Trained on the LeNER-Br dataset with a legal domain focus
  • Uses a transformer architecture with specialized Portuguese tokenization
  • Uses AdamW optimizer with linear learning rate scheduling
  • Trained for 10 epochs with batch size 4
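
The hyperparameters listed above interact in a way that is easy to overlook: with a batch size of 4 and 2 gradient accumulation steps, each optimizer update sees 8 examples. The sketch below works through that arithmetic and a plain linear learning rate decay. The training-set size and the absence of warmup are assumptions made for illustration, not values taken from the source, and the step count is a simplified estimate rather than the exact figure a training framework would report.

```python
import math

# Hyperparameters stated in the text above.
LEARNING_RATE = 2e-5
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
EPOCHS = 10

# Assumptions for illustration only (not from the source).
NUM_TRAIN_EXAMPLES = 8000  # hypothetical training-set size
WARMUP_STEPS = 0           # assumed: no warmup phase


def effective_batch_size(batch_size: int, accum_steps: int) -> int:
    """Number of examples contributing to one optimizer update."""
    return batch_size * accum_steps


def total_optimizer_steps(num_examples: int, batch_size: int,
                          accum_steps: int, epochs: int) -> int:
    """Simplified estimate of optimizer updates over the whole run."""
    per_epoch = math.ceil(num_examples / effective_batch_size(batch_size, accum_steps))
    return per_epoch * epochs


def linear_lr(step: int, total_steps: int, base_lr: float,
              warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total = total_optimizer_steps(NUM_TRAIN_EXAMPLES, BATCH_SIZE,
                                  GRAD_ACCUM_STEPS, EPOCHS)
    print("effective batch size:", effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("total optimizer steps:", total)
    for step in (0, total // 2, total):
        print(step, linear_lr(step, total, LEARNING_RATE, WARMUP_STEPS))
```

Under these assumptions the learning rate starts at 2e-5, reaches half that value at the midpoint of training, and falls to zero at the final step.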

Core Capabilities

  • Recognition of 6 entity types: JURISPRUDENCIA, LEGISLACAO, LOCAL, ORGANIZACAO, PESSOA, TEMPO
  • Best performance on person name recognition (98.7% precision)
  • Strong temporal expression detection (96.6% F1 score)
  • Efficient processing of legal documents with context awareness
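
To make the six entity types concrete, the sketch below shows how per-token tags can be merged into entity spans. It assumes the common B-/I-/O tagging scheme; the source does not state the model's exact label set, so the label list and the example sentence are illustrative only.

```python
from typing import List, Tuple

# The six entity types named in the list above.
ENTITY_TYPES = ["JURISPRUDENCIA", "LEGISLACAO", "LOCAL",
                "ORGANIZACAO", "PESSOA", "TEMPO"]


def build_label_list(entity_types: List[str]) -> List[str]:
    """Assumed label set: 'O' plus a B- and an I- tag per entity type."""
    labels = ["O"]
    for entity_type in entity_types:
        labels += [f"B-{entity_type}", f"I-{entity_type}"]
    return labels


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level B-/I-/O tags into (entity_type, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the span in progress, if there is one.
        nonlocal current_type, current_tokens
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the span in progress.
            current_tokens.append(token)
        elif tag.startswith("B-") or tag.startswith("I-"):
            # A B- tag, or an I- tag that does not continue the current
            # span, starts a new span (lenient handling of stray I- tags).
            flush()
            current_type, current_tokens = tag[2:], [token]
        else:
            # 'O' tag: outside any entity.
            flush()
    flush()
    return spans


if __name__ == "__main__":
    # Hypothetical sentence and tags, written by hand for illustration.
    tokens = ["O", "ministro", "João", "Silva", "citou", "a",
              "Lei", "nº", "8.112", "em", "1990", "."]
    tags = ["O", "O", "B-PESSOA", "I-PESSOA", "O", "O",
            "B-LEGISLACAO", "I-LEGISLACAO", "I-LEGISLACAO", "O", "B-TEMPO", "O"]
    print(decode_bio(tokens, tags))
```

Treating a stray I- tag as the start of a new span is a design choice; a stricter decoder could discard such tokens instead.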

Frequently Asked Questions

Q: What makes this model unique?

The model's specialization in Portuguese legal text sets it apart. Its two-stage training approach is reported to give better results than standard BERT models on this domain, and it scores particularly well on person and temporal entities, which makes it well suited to legal document processing.

Q: What are the recommended use cases?

The model is particularly suited to legal document analysis, court document processing, legal research automation, and other NLP tasks involving Portuguese legal texts. It is designed to identify legal references, organizations, and person names in legal contexts.
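
For the use cases above, a minimal sketch of running the model with the Hugging Face `transformers` token-classification pipeline might look like the following. The Hub identifier is an assumption formed from the author and model name shown on this page, and `group_by_entity`, `extract_entities`, and the `min_score` threshold are hypothetical helpers introduced here, not part of any library.

```python
from collections import defaultdict
from typing import Dict, List

# Assumed Hugging Face Hub ID, built from the author and model name on this page.
MODEL_ID = "pierreguillou/ner-bert-base-cased-pt-lenerbr"


def group_by_entity(predictions: List[dict], min_score: float = 0.5) -> Dict[str, List[str]]:
    """Group aggregated pipeline predictions by entity type.

    Expects dicts with 'entity_group', 'word', and 'score' keys, which is the
    shape the pipeline returns when an aggregation strategy is set.
    """
    grouped = defaultdict(list)
    for prediction in predictions:
        if prediction["score"] >= min_score:
            grouped[prediction["entity_group"]].append(prediction["word"])
    return dict(grouped)


def extract_entities(text: str, min_score: float = 0.5) -> Dict[str, List[str]]:
    """Run the model on a text and return entities grouped by type."""
    # Imported lazily: requires `transformers`, a backend such as PyTorch,
    # and network access to download the model on first use.
    from transformers import pipeline

    ner = pipeline("token-classification", model=MODEL_ID,
                   aggregation_strategy="simple")
    return group_by_entity(ner(text), min_score)


if __name__ == "__main__":
    # Hypothetical input sentence, for illustration only.
    print(extract_entities("O ministro João Silva citou a Lei nº 8.112 em 1990."))
```

In a real application the pipeline would be built once and reused across documents rather than recreated on every call.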
