StructTable-InternVL2-1B

U4R

A 938M-parameter model that converts table images to LaTeX, HTML, or Markdown. It supports English and Chinese, and can run accelerated inference through LMDeploy.

Property         Value
Parameter Count  938M
Model Type       Image-to-Text
License          Apache-2.0
Paper            DocGenome Paper
Languages        English, Chinese

What is StructTable-InternVL2-1B?

StructTable-InternVL2-1B is an image-to-text model that takes a picture of a table and emits the table as structured markup in LaTeX, HTML, or Markdown. It is built on the InternVL2 architecture and targets table understanding and conversion rather than general-purpose image captioning.
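To make the three target formats concrete, here is a minimal sketch that renders one small table as Markdown, HTML, and LaTeX. It does not call the model. The helper names (`to_markdown`, `to_html`, `to_latex`) are hypothetical and only illustrate the kind of markup each output format uses; the model's actual output may differ in layout and detail.

```python
from html import escape


def to_markdown(header, rows):
    """Render a table as a pipe-delimited Markdown table."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(lines)


def to_html(header, rows):
    """Render a table as a compact HTML <table> with escaped cell text."""
    head = "".join(f"<th>{escape(cell)}</th>" for cell in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def to_latex(header, rows):
    """Render a table as a LaTeX tabular (cell text is NOT escaped here)."""
    spec = "l" * len(header)  # one left-aligned column per header cell
    lines = [
        f"\\begin{{tabular}}{{{spec}}}",
        " & ".join(header) + r" \\",
        r"\hline",
    ]
    lines += [" & ".join(row) + r" \\" for row in rows]
    lines.append(r"\end{tabular}")
    return "\n".join(lines)


if __name__ == "__main__":
    header = ["Model", "Params"]
    rows = [["StructTable-InternVL2-1B", "938M"]]
    for render in (to_markdown, to_html, to_latex):
        print(render(header, rows))
        print()
```

Markdown is the most compact of the three but has no standard syntax for merged cells, so HTML or LaTeX is usually the safer choice when the source table has spanning cells.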

Implementation Details

The model has 938M parameters stored as BF16 tensors. It is fine-tuned on the DocGenome dataset and on synthetic tabular data, with data augmentation applied during training to improve robustness.

  • Efficient inference through LMDeploy integration
  • Support for multiple output formats (LaTeX, HTML, Markdown)
  • Comprehensive training on over 2 million high-quality Image-LaTeX pairs
  • Coverage of 156 disciplinary classes
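As a rough sizing aid, the parameter count and tensor type above give a lower bound on the memory needed just to hold the weights. The sketch below assumes exactly 938,000,000 parameters (the card only states "938M") and uses the fact that BF16 takes 2 bytes per value. It ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
PARAM_COUNT = 938_000_000  # assumption: "938M" taken as exactly 938 million
BYTES_PER_BF16 = 2         # bfloat16 is a 16-bit format
BYTES_PER_FP32 = 4         # shown only for comparison


def weight_bytes(param_count, bytes_per_param):
    """Bytes needed to store the weights alone at a given precision."""
    return param_count * bytes_per_param


bf16_bytes = weight_bytes(PARAM_COUNT, BYTES_PER_BF16)
fp32_bytes = weight_bytes(PARAM_COUNT, BYTES_PER_FP32)

if __name__ == "__main__":
    print(f"BF16 weights: {bf16_bytes / 1e9:.2f} GB")
    print(f"FP32 weights: {fp32_bytes / 1e9:.2f} GB")
```

Under these assumptions the BF16 weights take a little under 2 GB, half of what the same parameters would need in FP32.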

Core Capabilities

  • Table image to LaTeX/HTML/Markdown conversion
  • Structural extraction and analysis
  • Multi-lingual support (English and Chinese)
  • Accelerated inference through LMDeploy
  • Handling complex table structures including spanning cells
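Spanning cells are the main reason table output needs post-processing: in HTML they appear as `rowspan` and `colspan` attributes rather than as repeated cells. The sketch below uses only Python's standard `html.parser` module to expand such a table into a rectangular grid. The names `TableGridParser` and `expand_spans` are hypothetical, the sample HTML is hand-written rather than real model output, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableGridParser(HTMLParser):
    """Collect <td>/<th> cells together with their rowspan/colspan values."""

    def __init__(self):
        super().__init__()
        self.rows = []     # each row: list of (text, rowspan, colspan)
        self._cell = None  # [text_parts, rowspan, colspan] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:  # tolerate a cell that appears before any <tr>
                self.rows.append([])
            a = dict(attrs)
            rowspan = int(a.get("rowspan") or 1)
            colspan = int(a.get("colspan") or 1)
            self._cell = [[], rowspan, colspan]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, rowspan, colspan = self._cell
            self.rows[-1].append(("".join(parts).strip(), rowspan, colspan))
            self._cell = None


def expand_spans(html):
    """Return a rectangular list-of-lists, repeating text across spanned cells."""
    parser = TableGridParser()
    parser.feed(html)
    grid = {}  # (row, col) -> cell text
    for r, cells in enumerate(parser.rows):
        c = 0
        for text, rowspan, colspan in cells:
            while (r, c) in grid:  # skip slots filled by a rowspan from above
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


if __name__ == "__main__":
    sample = (
        "<table>"
        "<tr><th colspan='2'>Score</th><th rowspan='2'>Notes</th></tr>"
        "<tr><td>A</td><td>B</td></tr>"
        "</table>"
    )
    for row in expand_spans(sample):
        print(row)
```

Repeating the text across every spanned slot keeps the grid rectangular, which is convenient for loading into a dataframe or CSV; a consumer that needs to distinguish merged cells from genuinely repeated values would have to keep the span information instead.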

Frequently Asked Questions

Q: What makes this model unique?

The model can emit the same table in three output formats (LaTeX, HTML, Markdown), runs efficiently through LMDeploy, and is fine-tuned on the DocGenome dataset, which together make it a good fit for document processing applications. It also supports table content in both English and Chinese.

Q: What are the recommended use cases?

The model is suited to scientific publication processing, financial document analysis, automated document processing systems, and other applications that need tables extracted from images and converted to structured formats. It is particularly useful in academic and professional settings where the table's structure and formatting must be preserved precisely.
