TexTeller

Property	Value
Parameter Count	298M
License	Apache 2.0
Tensor Type	F32
Downloads	87,005

What is TexTeller?

TexTeller is a Vision Transformer (ViT) based model for end-to-end formula recognition: it takes an image of a mathematical formula and converts it into LaTeX-style text. The current release, version 2.0, is an upgrade over the original model and is aimed at accurate mathematical formula OCR.

Implementation Details

TexTeller 2.0 is built with PyTorch and also supports ONNX Runtime. It uses a vision-encoder-decoder architecture and was trained on a dataset of 7.5M samples, approximately 15 times larger than the dataset used for its predecessor. The parameters are stored in the safetensors format, and inference endpoints are available for deployment.

298M parameters for robust formula recognition
Trained on OleehyO/latex-formulas dataset
Supports complex multi-line formulas and matrices
Implements image-to-text transformation pipeline
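
The parameter count and tensor type listed above give a rough idea of how much memory the weights alone need. The following is a minimal back-of-the-envelope sketch, assuming the rounded figure of 298M parameters and 4 bytes per F32 value; the helper name weight_bytes is hypothetical, and the estimate ignores file-format overhead, activations, and any runtime buffers.

```python
# Rough estimate of the memory needed just to hold the model weights.
# Assumption: every parameter is stored as F32, as the property table states.

PARAM_COUNT = 298_000_000  # "298M" from the property table (a rounded figure)
BYTES_PER_F32 = 4          # a 32-bit float occupies 4 bytes


def weight_bytes(param_count: int, bytes_per_param: int) -> int:
    """Return the raw size of the weights in bytes (hypothetical helper)."""
    return param_count * bytes_per_param


total = weight_bytes(PARAM_COUNT, BYTES_PER_F32)
print(f"{total / 1e9:.2f} GB")     # decimal gigabytes
print(f"{total / 2**30:.2f} GiB")  # binary gibibytes
```

Under these assumptions the weights come to roughly 1.19 GB, so actual memory use during inference will be somewhat higher than that.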

Core Capabilities

Recognition of formulas in natural images
Superior handling of rare mathematical symbols
Accurate processing of complex matrix structures
Enhanced generalization compared to alternatives
Robust performance on multi-line equations

Frequently Asked Questions

Q: What makes this model unique?

TexTeller 2.0 stands out for the size of its training dataset: 7.5M samples, compared with 100K samples for LaTeX-OCR. The larger dataset is credited with better generalization and higher accuracy, especially on complex mathematical expressions.
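
To put the dataset figures in proportion, the short calculation below works out the ratios implied by the numbers quoted on this page. It is only arithmetic on those rounded figures: the predecessor's dataset size is not stated here, so the value derived from the "approximately 15 times" claim is an estimate, and the variable names are illustrative.

```python
# Dataset sizes as quoted on this page (rounded figures).
TEXTELLER_V2_SAMPLES = 7_500_000  # "7.5M samples"
LATEX_OCR_SAMPLES = 100_000       # "100K samples"
V2_OVER_PREDECESSOR = 15          # "approximately 15 times larger"

# How many times larger the TexTeller 2.0 dataset is than LaTeX-OCR's.
ratio_vs_latex_ocr = TEXTELLER_V2_SAMPLES / LATEX_OCR_SAMPLES

# Predecessor dataset size implied by the 15x claim (an estimate, not a stated fact).
implied_predecessor_samples = TEXTELLER_V2_SAMPLES / V2_OVER_PREDECESSOR

print(f"vs LaTeX-OCR: {ratio_vs_latex_ocr:.0f}x")
print(f"implied predecessor size: about {implied_predecessor_samples:,.0f} samples")
```

On these figures the TexTeller 2.0 dataset is 75 times the size quoted for LaTeX-OCR, and the predecessor would have been trained on roughly 500,000 samples.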

Q: What are the recommended use cases?

The model is ideal for academic and professional applications requiring mathematical formula digitization, including digital textbook creation, research paper processing, and educational technology tools. It excels at handling both simple equations and complex mathematical structures.
