# ModernBERT-base-squad2-v0.2
| Property | Value |
|---|---|
| Base Model | ModernBERT-base-nli |
| Parameters | 149 million |
| Context Length | 8,192 tokens |
| Training Dataset | SQuAD v2 |
| Performance | 83.96% exact match, 87.04% F1 |
| Author | Praise2112 |
| Model Hub | Hugging Face |
## What is ModernBERT-base-squad2-v0.2?
ModernBERT-base-squad2-v0.2 is a specialized question-answering model fine-tuned on the SQuAD v2 dataset. Built upon the ModernBERT architecture, it leverages advanced features like Rotary Positional Embeddings (RoPE) and Local-Global Alternating Attention to handle long-form content effectively.
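A minimal usage sketch with the Hugging Face `transformers` pipeline is shown below. The Hub id `Praise2112/ModernBERT-base-squad2-v0.2` is inferred from the author and model name on this card, so verify it before running.

```python
from transformers import pipeline

# Hub id inferred from this card's author/model name (assumption); verify it.
qa = pipeline(
    "question-answering",
    model="Praise2112/ModernBERT-base-squad2-v0.2",
)

context = (
    "ModernBERT is an encoder-only transformer that combines Rotary "
    "Positional Embeddings with local-global alternating attention, "
    "supporting inputs of up to 8,192 tokens."
)
result = qa(question="How long an input can ModernBERT process?", context=context)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': '8,192 tokens'}
```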
## Implementation Details
The model incorporates several modern architectural improvements that enhance its performance and efficiency:
- Utilizes Rotary Positional Embeddings (RoPE) for long-context support
- Implements Local-Global Alternating Attention for efficient processing of long inputs
- Features unpadding and Flash Attention for optimized inference
- Trained with a maximum sequence length of 8,192 tokens
- Fine-tuned with the AdamW optimizer, a learning rate of 3e-05, and a linear learning-rate schedule (sketched below)
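A rough sketch of those training settings, expressed as `transformers` `TrainingArguments`, follows. The batch size, epoch count, and output path are illustrative assumptions; the card reports only the optimizer, learning rate, schedule, and sequence length.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="modernbert-base-squad2",  # hypothetical output path
    learning_rate=3e-5,                   # as reported on this card
    lr_scheduler_type="linear",           # linear learning-rate schedule
    optim="adamw_torch",                  # AdamW optimizer
    per_device_train_batch_size=8,        # assumption: not reported
    num_train_epochs=2,                   # assumption: not reported
)

# The 8,192-token maximum applies at tokenization time, e.g.:
# tokenizer(question, context, truncation="only_second", max_length=8192)
```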
## Core Capabilities
- Advanced question-answering on long documents
- Handles context lengths up to 8,192 tokens (see the long-document sketch after this list)
- Achieves 83.96% exact match and 87.04% F1 on the SQuAD v2 evaluation set
- Efficient processing through modern attention mechanisms
- Suitable for document retrieval and semantic search tasks
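For long documents specifically, the question-answering pipeline's standard `max_seq_len` and `doc_stride` arguments can raise its default 384-token window toward the model's 8,192-token limit. A sketch under the same Hub-id assumption as above:

```python
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="Praise2112/ModernBERT-base-squad2-v0.2",  # Hub id assumed as above
)

with open("report.txt") as f:  # hypothetical long document
    long_document = f.read()

result = qa(
    question="What was the main finding?",
    context=long_document,
    max_seq_len=8192,               # use the model's full context window
    doc_stride=256,                 # overlap between chunks if splitting is still needed
    handle_impossible_answer=True,  # SQuAD v2 includes unanswerable questions
)
print(result["answer"], result["score"])
```

Setting `handle_impossible_answer=True` matters for SQuAD v2-style models, since that dataset deliberately includes questions the context cannot answer.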
## Frequently Asked Questions
Q: What makes this model unique?
This model combines ModernBERT's advanced architecture with fine-tuning specifically for question-answering tasks. Its 8,192-token context length and modern attention mechanisms make it particularly effective for long-document analysis.
Q: What are the recommended use cases?
The model excels in question-answering tasks, especially those involving long documents. It's ideal for applications in document retrieval, classification, and semantic search within large text corpora.