camembert-base-squadFR-fquad-piaf
Property | Value |
---|---|
Base Model | CamemBERT-base |
Task | French Question-Answering |
Training Data | PIAF v1.1, FQuAD v1.0, SQuAD-FR |
F1 Score (FQuAD) | 79.81% |
Exact Match (FQuAD) | 55.14% |
Author | AgentPublic |
What is camembert-base-squadFR-fquad-piaf?
This is a French extractive question-answering model built on the CamemBERT base architecture. It was fine-tuned on three French QA datasets: PIAF v1.1, FQuAD v1.0, and SQuAD-FR, a French translation of SQuAD. It reaches F1 scores of about 80% on both the FQuAD and SQuAD-FR evaluation sets.
Implementation Details
The model uses the CamemBERT architecture and was fine-tuned with a learning rate of 3e-5, a batch size of 12, and 4 training epochs. The maximum sequence length is 384 tokens, which covers the question and the context together. Contexts that do not fit are split into overlapping windows using a document stride of 128, so answers can still be found in long passages.
- Trained using HuggingFace's transformers library
- Optimized hyperparameters for French QA tasks
- Processes inputs in windows of up to 384 tokens (question and context combined)
- Achieves balanced performance across multiple French QA datasets
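To illustrate how a 384-token limit and a document stride of 128 interact, here is a minimal sketch of the windowing step. It assumes the stride is the step between consecutive window starts, as in the original BERT SQuAD preprocessing. The source does not say which script was used, and some tokenizer APIs define stride as the overlap instead. The function name `split_into_windows`, the 20-token question, and the count of 4 special tokens are illustrative assumptions, not values from the model card.

```python
def split_into_windows(num_context_tokens, max_context_len, doc_stride):
    """Return (start, length) pairs that cover the context with overlapping windows."""
    windows = []
    start = 0
    while start < num_context_tokens:
        length = min(max_context_len, num_context_tokens - start)
        windows.append((start, length))
        if start + length == num_context_tokens:
            break  # this window already reaches the end of the context
        start += min(length, doc_stride)
    return windows


MAX_SEQ_LENGTH = 384   # from the model card
DOC_STRIDE = 128       # from the model card
QUESTION_TOKENS = 20   # hypothetical question length
SPECIAL_TOKENS = 4     # assumption: separator tokens around question and context

# Room left for context tokens in each window.
max_context_len = MAX_SEQ_LENGTH - QUESTION_TOKENS - SPECIAL_TOKENS

windows = split_into_windows(1000, max_context_len, DOC_STRIDE)
for start, length in windows:
    print(f"context tokens {start}..{start + length - 1}")
```

Under these assumptions a 1,000-token context yields six windows of 360 context tokens each, and consecutive windows share 232 tokens. A context that fits in one window produces a single entry.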
Core Capabilities
- Extracts precise answers from French text passages
- Handles both factoid and descriptive questions
- Processes various types of French language content
- Achieves 79.81% F1 score on FQuAD and 80.61% on SQuAD-FR
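The gap between the F1 and Exact Match figures reported for FQuAD comes from how the two metrics are defined: Exact Match requires the whole answer string to agree, while F1 gives partial credit for overlapping tokens. The sketch below shows the usual SQuAD-style definitions with a simplified normalization (lowercasing, ASCII punctuation removal, whitespace collapsing). It is not the official FQuAD or SQuAD-FR evaluation script, and the helper names are chosen here for illustration.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace (simplified)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# A prediction with one extra word fails Exact Match but still earns partial F1.
em = exact_match("la tour Eiffel", "tour Eiffel")
f1 = f1_score("la tour Eiffel", "tour Eiffel")
print(em, round(f1, 2))
```

Dataset-level scores are the averages of these per-question values, typically taking the best score over all reference answers for each question.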
Frequently Asked Questions
Q: What makes this model unique?
The model is fine-tuned on three French QA datasets rather than one. Combining PIAF, FQuAD, and SQuAD-FR gives it more varied training examples, which helps it handle a wider range of question types and contexts.
Q: What are the recommended use cases?
The model suits French-language applications that need extractive question answering, such as customer service automation, information extraction from French documents, and educational tools. Because inputs are processed in 384-token windows with a document stride of 128, it can also be applied to passages longer than a single input.
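For these use cases, one plausible way to query the model is the `question-answering` pipeline from HuggingFace's transformers library. This is a minimal sketch, not an official usage guide. The Hub identifier `AgentPublic/camembert-base-squadFR-fquad-piaf` is an assumption derived from the author and model name listed above, and the helper names `build_qa_pipeline` and `answer` are illustrative.

```python
# Assumed Hub identifier (author + model name); verify before use.
MODEL_ID = "AgentPublic/camembert-base-squadFR-fquad-piaf"


def build_qa_pipeline(model_id=MODEL_ID):
    """Create a question-answering pipeline; downloads weights on first use."""
    from transformers import pipeline  # imported lazily so the helpers load without it
    return pipeline("question-answering", model=model_id, tokenizer=model_id)


def answer(qa, question, context):
    """Run one question against one context and return (answer text, score)."""
    result = qa(question=question, context=context)
    return result["answer"], result["score"]


# Example usage (requires transformers, a backend such as PyTorch, and network access):
# qa = build_qa_pipeline()
# text, score = answer(
#     qa,
#     question="Quelle est la capitale de la France ?",
#     context="Paris est la capitale de la France.",
# )
# print(text, score)
```

The pipeline returns a dictionary with `answer`, `score`, `start`, and `end` keys, where `start` and `end` are character offsets into the context. Keeping `answer` separate from pipeline construction lets the same helper be reused across many questions without reloading the model.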