doc-ufcn-generic-historical-line

Maintained By
Teklia

Doc-UFCN Generic Historical Line Detection Model

PropertyValue
LicenseMIT
FrameworkPyTorch
TaskImage Segmentation
PaperView Paper

What is doc-ufcn-generic-historical-line?

The Doc-UFCN generic historical line detection model is a specialized AI system designed to identify and extract text lines from historical documents. Developed by Teklia, this model has been trained on an extensive collection of 10 historical document datasets, including prominent collections like Bozen, cBAD2017, and ScribbleLens.

Implementation Details

The model processes images by scaling their largest dimension to 768 pixels while maintaining the original aspect ratio. It employs a unique training approach focused on reducing merger errors in predictions, making it particularly effective for handling diverse historical manuscript formats.

  • Trained on multiple public datasets including DIVA-HisDB and Horae
  • Optimized for 768-pixel maximum dimension processing
  • Implements advanced merger reduction techniques

Core Capabilities

  • Achieves 98.02% AP@.5 on ScribbleLens dataset
  • Robust performance across various historical document types
  • Effective text line detection with IoU scores ranging from 41.54% to 76.61%
  • Specialized handling of both simple and complex historical manuscripts

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its ability to process diverse historical documents through its training on 10 different datasets, making it highly versatile for various manuscript types. Its specialized training to reduce merger errors sets it apart from conventional text line detection models.

Q: What are the recommended use cases?

The model is ideal for digitization projects involving historical manuscripts, archival document processing, and research in historical document analysis. It's particularly effective for documents similar to those in the training datasets like Bozen, cBAD2017, and ScribbleLens.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.