TexTeller
| Property | Value |
|---|---|
| Parameter Count | 298M |
| License | Apache 2.0 |
| Tensor Type | F32 |
| Downloads | 87,005 |
What is TexTeller?
TexTeller is a Vision Transformer (ViT) based model for end-to-end formula recognition: given an image of a formula, it outputs the corresponding LaTeX. Version 2.0, the current release, is a substantial upgrade for mathematical formula OCR.
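As a rough illustration of what "ViT based" means for the input side, a Vision Transformer cuts each image into fixed-size square patches and treats every patch as one token. The sketch below only counts those tokens. The function name `patch_sequence_length`, the 448x448 image size, and the 16-pixel patch size are all hypothetical values chosen for the example, not settings documented for TexTeller.

```python
def patch_sequence_length(height: int, width: int, patch: int) -> int:
    """Return the number of patch tokens a ViT encoder sees for one image."""
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    return (height // patch) * (width // patch)


# Hypothetical sizes for illustration only, not TexTeller's documented settings.
tokens = patch_sequence_length(448, 448, 16)
print(tokens)  # 784 (a 28 x 28 grid of patches)
```

The encoder's cost grows with this token count, which is one reason formula images are typically resized to a fixed resolution before recognition.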
Implementation Details
Built with PyTorch and compatible with ONNX Runtime, TexTeller 2.0 uses a vision-encoder-decoder architecture trained on a 7.5M-sample dataset, roughly 15 times larger than the one used for its predecessor. Weights are stored in the safetensors format, and the model can be served through inference endpoints.
- 298M parameters for robust formula recognition
- Trained on the OleehyO/latex-formulas dataset
- Supports complex multi-line formulas and matrices
- Implements an image-to-text pipeline (formula image in, LaTeX text out)
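For deployment planning, the parameter count and tensor type listed above give a quick estimate of the weight size. The sketch below is back-of-the-envelope arithmetic only: the helper `weight_bytes` is a hypothetical name, the F16 entry is included purely for comparison and is not a published variant, and the figure excludes activations and any runtime overhead.

```python
# Bytes per parameter for each tensor type; F16 is listed only for comparison.
BYTES_PER_DTYPE = {"F32": 4, "F16": 2}


def weight_bytes(param_count: int, dtype: str = "F32") -> int:
    """Approximate size of the raw weights in bytes."""
    return param_count * BYTES_PER_DTYPE[dtype]


params = 298_000_000  # 298M parameters, as listed in the table
size = weight_bytes(params, "F32")
print(f"{size / 1e9:.2f} GB")    # 1.19 GB
print(f"{size / 2**30:.2f} GiB")  # 1.11 GiB
```

So the F32 weights alone take a little over 1 GB, which fits comfortably on most consumer GPUs and in CPU memory.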
Core Capabilities
- Recognition of formulas in natural images
- Superior handling of rare mathematical symbols
- Accurate processing of complex matrix structures
- Better generalization than alternatives such as LaTeX-OCR
- Robust performance on multi-line equations
Frequently Asked Questions
Q: What makes this model unique?
TexTeller 2.0 stands out for the size of its training set: 7.5M samples, compared with about 100K for competitors such as LaTeX-OCR. The larger dataset is credited with better generalization and higher accuracy, especially on complex mathematical expressions.
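To put the dataset figures in proportion, the quoted numbers can be compared directly. This is plain arithmetic on the values stated in this document. The predecessor's dataset size is not given here, so the last line only infers it from the "approximately 15 times larger" statement and should be read as an estimate.

```python
texteller_v2_samples = 7_500_000  # TexTeller 2.0 training set
latex_ocr_samples = 100_000       # LaTeX-OCR, as quoted above
growth_over_v1 = 15               # "approximately 15 times larger"

ratio_vs_latex_ocr = texteller_v2_samples // latex_ocr_samples
implied_v1_samples = texteller_v2_samples // growth_over_v1

print(ratio_vs_latex_ocr)  # 75
print(implied_v1_samples)  # 500000 (estimate inferred from the ~15x figure)
```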
Q: What are the recommended use cases?
The model suits academic and professional applications that need to digitize mathematical formulas, including digital textbook creation, research paper processing, and educational technology tools. It handles both simple equations and complex structures such as matrices and multi-line formulas.