# ModernBERT-base-squad2-v0.2
| Property | Value |
|---|---|
| Base Model | ModernBERT-base-nli |
| Parameters | 149 million |
| Context Length | 8,192 tokens |
| Training Dataset | SQuAD v2 |
| Performance | 83.96% exact match, 87.04% F1 |
| Author | Praise2112 |
| Model Hub | Hugging Face |
## What is ModernBERT-base-squad2-v0.2?
ModernBERT-base-squad2-v0.2 is a specialized question-answering model fine-tuned on the SQuAD v2 dataset. Built upon the ModernBERT architecture, it leverages advanced features like Rotary Positional Embeddings (RoPE) and Local-Global Alternating Attention to handle long-form content effectively.
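A minimal usage sketch with the Hugging Face `transformers` pipeline is shown below. The Hub id `Praise2112/ModernBERT-base-squad2-v0.2` is inferred from the author and model name on this card, so verify it before running.

```python
from transformers import pipeline

# Hub id inferred from this card's author/model name (assumption); verify it.
qa = pipeline(
    "question-answering",
    model="Praise2112/ModernBERT-base-squad2-v0.2",
)

context = (
    "ModernBERT is an encoder-only transformer that combines Rotary "
    "Positional Embeddings with local-global alternating attention, "
    "supporting inputs of up to 8,192 tokens."
)
result = qa(question="How long an input can ModernBERT process?", context=context)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': '8,192 tokens'}
```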
## Implementation Details
The model incorporates several modern architectural improvements that enhance its performance and efficiency:
- Utilizes Rotary Positional Embeddings (RoPE) for long-context support
- Implements Local-Global Alternating Attention for efficient processing of long inputs
- Features unpadding and Flash Attention for optimized inference
- Trained with a maximum sequence length of 8,192 tokens
- Fine-tuned with the AdamW optimizer, a learning rate of 3e-05, and a linear learning-rate schedule (sketched below)
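A rough sketch of those training settings, expressed as `transformers` `TrainingArguments`, follows. The batch size, epoch count, and output path are illustrative assumptions; the card reports only the optimizer, learning rate, schedule, and sequence length.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="modernbert-base-squad2",  # hypothetical output path
    learning_rate=3e-5,                   # as reported on this card
    lr_scheduler_type="linear",           # linear learning-rate schedule
    optim="adamw_torch",                  # AdamW optimizer
    per_device_train_batch_size=8,        # assumption: not reported
    num_train_epochs=2,                   # assumption: not reported
)

# The 8,192-token maximum applies at tokenization time, e.g.:
# tokenizer(question, context, truncation="only_second", max_length=8192)
```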
## Core Capabilities
- Advanced question-answering on long documents
- Handles context lengths up to 8,192 tokens (see the long-document sketch after this list)
- Achieves 83.96% exact match and 87.04% F1 on the SQuAD v2 evaluation set
- Efficient processing through modern attention mechanisms
- Suitable for document retrieval and semantic search tasks
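For long documents specifically, the question-answering pipeline's standard `max_seq_len` and `doc_stride` arguments can raise its default 384-token window toward the model's 8,192-token limit. A sketch under the same Hub-id assumption as above:

```python
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="Praise2112/ModernBERT-base-squad2-v0.2",  # Hub id assumed as above
)

with open("report.txt") as f:  # hypothetical long document
    long_document = f.read()

result = qa(
    question="What was the main finding?",
    context=long_document,
    max_seq_len=8192,               # use the model's full context window
    doc_stride=256,                 # overlap between chunks if splitting is still needed
    handle_impossible_answer=True,  # SQuAD v2 includes unanswerable questions
)
print(result["answer"], result["score"])
```

Setting `handle_impossible_answer=True` matters for SQuAD v2-style models, since that dataset deliberately includes questions the context cannot answer.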
## Frequently Asked Questions
Q: What makes this model unique?
This model combines ModernBERT's advanced architecture with fine-tuning specifically for question-answering tasks. Its 8,192-token context length and modern attention mechanisms make it particularly effective for long-document analysis.
Q: What are the recommended use cases?
The model excels in question-answering tasks, especially those involving long documents. It's ideal for applications in document retrieval, classification, and semantic search within large text corpora.