journaux-lm-v1

PleIAs

A French language model trained on 408 GB of historical newspapers using the ELECTRA architecture and the TEAMS approach, optimized for Named Entity Recognition.

Property              Value
License               Apache 2.0
Language              French
Training Data Size    408 GB
Architecture          ELECTRA with TEAMS approach

What is journaux-lm-v1?

Journaux-LM-v1 is a French language model specialized for historical newspapers. It is built on the ELECTRA architecture and trained with the TEAMS approach on the PleIAs/French-PD-Newspapers dataset, which comprises 408 GB of historical French newspaper content.

Implementation Details

The model uses the ELECTRA architecture with the TEAMS (Training ELECTRA Augmented with Multi-word Selection) pre-training methodology. It has been optimized for Named Entity Recognition (NER) tasks and outperforms the French Europeana BERT model across multiple benchmark datasets.

  • Trained on the PleIAs/French-PD-Newspapers dataset
  • Uses the TEAMS approach during pre-training
  • Optimized for historical text processing
  • Outperforms the French Europeana BERT model on multiple NER benchmarks
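To make the ELECTRA side of the architecture more concrete, here is a minimal sketch of how replaced-token-detection targets can be built. It is a toy illustration only: the function names `corrupt_tokens` and `rtd_labels` are hypothetical, the random replacement stands in for ELECTRA's learned generator network, and the additional TEAMS objective is not shown. It does not reproduce the actual training code for this model.

```python
import random


def corrupt_tokens(tokens, vocab, replace_prob=0.15, seed=0):
    """Toy stand-in for the generator: replace some tokens with random vocabulary items.

    In ELECTRA the replacements come from a small masked language model;
    random sampling is an assumption made here to keep the sketch self-contained.
    """
    rng = random.Random(seed)
    corrupted = []
    for token in tokens:
        if rng.random() < replace_prob:
            corrupted.append(rng.choice(vocab))
        else:
            corrupted.append(token)
    return corrupted


def rtd_labels(original, corrupted):
    """Discriminator targets: 1 where the token was replaced, 0 where it is original.

    A sampled replacement that happens to equal the original token counts as original.
    """
    return [int(o != c) for o, c in zip(original, corrupted)]


original = ["le", "journal", "de", "paris"]
corrupted = ["le", "gazette", "de", "paris"]
print(rtd_labels(original, corrupted))
```

The discriminator is trained to predict these per-token labels, so every position in the input contributes to the loss rather than only the masked positions.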

Core Capabilities

  • Named Entity Recognition with an average F1-score of 77.17% on test sets
  • Specialized processing of historical French texts
  • Improved NER performance over the French Europeana BERT model
  • Handles various historical document types and formats
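For readers unfamiliar with how an NER F1-score such as the one above is typically obtained, the following sketch computes an entity-level F1 from BIO-tagged sequences. The source does not state which evaluation script was used, so exact-match entity-level scoring is an assumption here, and the helper names `extract_spans` and `entity_f1` as well as the tag labels are illustrative.

```python
def extract_spans(tags):
    """Return the set of (label, start, end) entity spans in a BIO tag sequence.

    `end` is exclusive. An I- tag that does not continue the current entity
    is treated leniently as the start of a new one.
    """
    spans = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
            if tag.startswith("B-") or tag.startswith("I-"):
                start, label = i, tag[2:]
            else:
                start, label = None, None
    return spans


def entity_f1(gold_tags, pred_tags):
    """Entity-level F1: a prediction counts only if label and boundaries match exactly."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
print(round(entity_f1(gold, pred), 4))  # one of two gold entities found: P=1.0, R=0.5
```

An average figure like the one reported would then be the mean of such scores across the individual test sets, under the same assumption.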

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its focus on historical French newspapers, combining the TEAMS approach with the ELECTRA architecture. On NER tasks it consistently outperforms the French Europeana BERT model, with improvements of up to 1.12% on development sets and 0.98% on test sets.

Q: What are the recommended use cases?

The model is particularly well suited to Named Entity Recognition in historical French texts, processing of historical newspaper content, and analysis of public domain French-language materials. It excels in tasks that require an understanding of historical context and language patterns.
