InCaseLawBERT
| Property | Value |
|---|---|
| Parameters | 110M |
| License | MIT |
| Paper | Pre-training Transformers on Indian Legal Text |
| Training Data | 5.4M Indian legal documents |
What is InCaseLawBERT?
InCaseLawBERT is a BERT model specialized for Indian legal text processing. Initialized from CaseLawBERT, it was further pre-trained on a corpus of 5.4 million Indian legal documents spanning 1950 to 2019, covering Civil, Criminal, Constitutional, and other legal domains.
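A minimal loading sketch with the Hugging Face transformers library is shown below. The hub ID law-ai/InCaseLawBERT is assumed here; substitute the actual checkpoint path if it differs.

```python
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "law-ai/InCaseLawBERT"  # assumed hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

text = "The appellant was convicted under Section 302 of the Indian Penal Code."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, seq_len, 768])
```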
Implementation Details
The model follows the bert-base-uncased architecture, with 12 hidden layers, a hidden size of 768, and 12 attention heads (see the configuration sketch after the list below). Starting from the CaseLawBERT checkpoint, it was trained for a further 300K steps with the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives on a 27GB corpus of legal text.
- Architecture: BERT-base configuration
- Training Corpus: 5.4M Indian legal documents
- Training Tasks: MLM and NSP
- Base Model: CaseLawBERT
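As a quick sanity check of the figures above, the architecture parameters can be read straight from the checkpoint's config. This is a sketch, again assuming the law-ai/InCaseLawBERT hub ID:

```python
from transformers import AutoConfig

# Assumed hub ID; adjust to the actual checkpoint path.
config = AutoConfig.from_pretrained("law-ai/InCaseLawBERT")

print(config.num_hidden_layers)    # expected: 12
print(config.hidden_size)          # expected: 768
print(config.num_attention_heads)  # expected: 12
```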
Core Capabilities
- Legal Statute Identification
- Semantic Segmentation of Legal Documents
- Court Judgment Prediction
- Legal Text Embeddings Generation (see the sketch below)
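For the embedding-generation capability, one common recipe is to mean-pool the final hidden states over non-padding tokens. The snippet below sketches that approach; the pooling strategy is a standard choice, not one prescribed by the paper.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "law-ai/InCaseLawBERT"  # assumed hub ID
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

sentences = [
    "The writ petition is dismissed.",
    "Bail is granted to the petitioner.",
]
inputs = tokenizer(sentences, padding=True, truncation=True,
                   max_length=512, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, 768)

# Mean-pool over real tokens only, using the attention mask.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768])
```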
Frequently Asked Questions
Q: What makes this model unique?
InCaseLawBERT is optimized for Indian legal text processing and was trained on one of the largest Indian legal corpora available. It performs comparably to CaseLawBERT while being adapted to Indian legal contexts.
Q: What are the recommended use cases?
The model excels at legal document analysis tasks, including statute identification, semantic segmentation, and judgment prediction. It is particularly well suited to applications involving Indian legal documents and can be fine-tuned for specific legal NLP tasks.
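As an illustration of the fine-tuning path, the sketch below attaches a fresh two-class classification head (e.g. for a binary judgment-prediction setup) and runs one forward/backward pass. The task framing and the label used here are hypothetical.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "law-ai/InCaseLawBERT"  # assumed hub ID
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# num_labels=2 adds a randomly initialized classification head on top of
# the pre-trained encoder; it must be trained before real use.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

batch = tokenizer(["The appeal is allowed."], return_tensors="pt")
labels = torch.tensor([1])  # toy label: 1 = claim accepted (hypothetical)
loss = model(**batch, labels=labels).loss
loss.backward()  # gradients flow through the head and the full encoder
print(float(loss))
```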