legal-bert-base-cased-ptbr

Maintained By
dominguesm

Property          Value
Parameter Count   126M
License           cc-by-4.0
Author            dominguesm
Task Type         Fill-Mask
Language          Portuguese

What is legal-bert-base-cased-ptbr?

legal-bert-base-cased-ptbr is a BERT language model specialized for Portuguese legal text. It is based on the BERTimbau base architecture and was fine-tuned on a collection of Brazilian legal documents, including court decisions, petitions, and administrative legal texts.

Implementation Details

The model has 126M parameters and was trained on more than 60,000 legal documents from the Brazilian Supreme Federal Tribunal (Supremo Tribunal Federal, STF). Training ran for 3 epochs with a batch size of 32 and reached a final evaluation loss of 0.47 and a perplexity of 1.60.

  • Trained on 61,309 miscellaneous legal documents
  • Covers document types such as petitions, sentences, and court orders
  • Uses the masked language modeling (MLM) objective to learn contextual representations
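
The reported evaluation loss and perplexity are consistent with each other if perplexity is taken as the exponential of the mean cross-entropy loss, which is the usual convention for language-model evaluation. That convention is an assumption here, since the text does not state how the perplexity was computed. The helper name `perplexity_from_loss` is illustrative only.

```python
import math


def perplexity_from_loss(eval_loss: float) -> float:
    """Return exp(loss): perplexity for a mean cross-entropy loss in nats."""
    return math.exp(eval_loss)


reported_loss = 0.47  # final evaluation loss stated above
print(f"{perplexity_from_loss(reported_loss):.2f}")  # prints 1.60
```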

Core Capabilities

  • Fill-mask prediction for legal text completion
  • Contextual understanding of Portuguese legal terminology
  • Support for both PyTorch and TensorFlow frameworks
  • Optimized for legal domain tasks
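
A minimal sketch of fill-mask prediction with the Hugging Face `transformers` pipeline follows. It assumes the model is published on the Hub as `dominguesm/legal-bert-base-cased-ptbr` (inferred from the author and model name above, not confirmed here) and that it uses the standard BERT `[MASK]` token. The helper names and the example sentence are hypothetical.

```python
MODEL_ID = "dominguesm/legal-bert-base-cased-ptbr"  # assumed Hugging Face Hub id
MASK_TOKEN = "[MASK]"  # standard BERT mask token (assumed for this model)


def mask_word(sentence: str, word: str) -> str:
    """Replace the first occurrence of `word` with the mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} not found in sentence")
    return sentence.replace(word, MASK_TOKEN, 1)


def predict_masked(text: str, top_k: int = 5):
    """Return (token, score) pairs for the masked position.

    Needs the `transformers` package plus PyTorch or TensorFlow, and
    downloads the model weights on first use.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    fill = pipeline("fill-mask", model=MODEL_ID)
    return [(p["token_str"], p["score"]) for p in fill(text, top_k=top_k)]


# Hypothetical example sentence, not taken from the model card.
masked = mask_word("O réu foi condenado pelo tribunal.", "tribunal")
print(masked)  # O réu foi condenado pelo [MASK].
# predictions = predict_masked(masked)  # uncomment to query the model
```

For repeated queries it would be cheaper to build the pipeline once and reuse it. The sketch rebuilds it on each call only to stay short.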

Frequently Asked Questions

Q: What makes this model unique?

This model specializes in Portuguese legal language. Because it was fine-tuned on Brazilian legal documents, it is particularly suited to analyzing and processing legal text in Portuguese.

Q: What are the recommended use cases?

The model is well suited to legal text completion, document analysis, and legal research applications. It is a particularly good fit for tasks involving Brazilian legal documents and can support legal technology applications and computational law research.
