StructTable-InternVL2-1B

Maintained by: U4R

Property          Value
Parameter Count   938M
Model Type        Image-to-Text
License           Apache-2.0
Paper             DocGenome Paper
Languages         English, Chinese

What is StructTable-InternVL2-1B?

StructTable-InternVL2-1B converts images of tables into structured formats: LaTeX, HTML, and Markdown. It is built on the InternVL2 architecture and is aimed at table understanding and conversion tasks.

Implementation Details

The model has 938M parameters and uses BF16 tensors for efficient processing. It is fine-tuned on the DocGenome dataset and on synthetic tabular data, with data augmentation applied to improve robustness and performance.

  • Efficient inference through LMDeploy integration
  • Support for multiple output formats (LaTeX, HTML, Markdown)
  • Comprehensive training on over 2 million high-quality Image-LaTeX pairs
  • Coverage of 156 disciplinary classes
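As a rough illustration of what the parameter count and BF16 tensor type imply, the sketch below estimates the memory needed to hold the weights alone. It assumes 2 bytes per BF16 parameter and 4 bytes per FP32 parameter, and it ignores activations, caches, and framework overhead, so real usage will be higher. The function name `weight_memory_bytes` is hypothetical and not part of any StructTable tooling.

```python
def weight_memory_bytes(num_params: int, bytes_per_param: int = 2) -> int:
    """Bytes needed to store the weights only (no activations or caches)."""
    return num_params * bytes_per_param


# Parameter count as listed for this model (938M).
params = 938_000_000

bf16_bytes = weight_memory_bytes(params)      # BF16: 2 bytes per parameter
fp32_bytes = weight_memory_bytes(params, 4)   # FP32: 4 bytes, for comparison

print(f"BF16 weights: {bf16_bytes / 1e9:.2f} GB")
print(f"FP32 weights: {fp32_bytes / 1e9:.2f} GB")
```

Under these assumptions the BF16 weights take roughly half the space that FP32 weights would.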

Core Capabilities

  • Table image to LaTeX/HTML/Markdown conversion
  • Structural extraction and analysis
  • Multi-lingual support (English and Chinese)
  • Efficient inference through LMDeploy
  • Handling complex table structures including spanning cells
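To make "spanning cells" concrete, here is a minimal sketch of how downstream code might expand an HTML table containing `rowspan` and `colspan` attributes into a rectangular grid. It uses only Python's standard library. The class name `SpanExpander` is hypothetical, and the sample HTML is hand-written for illustration; it is not actual output from the model.

```python
from html.parser import HTMLParser


class SpanExpander(HTMLParser):
    """Expand an HTML table into a grid, repeating the text of spanned cells."""

    def __init__(self):
        super().__init__()
        self.cells = {}    # (row, col) -> cell text
        self.row = -1
        self.col = 0
        self.span = None   # (rowspan, colspan) while inside a cell
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row += 1
            self.col = 0
        elif tag in ("td", "th"):
            a = dict(attrs)
            self.span = (int(a.get("rowspan", 1)), int(a.get("colspan", 1)))
            self.text = []

    def handle_data(self, data):
        if self.span is not None:   # collect text only inside a cell
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.span is not None:
            rowspan, colspan = self.span
            # Skip positions already filled by a rowspan from an earlier row.
            while (self.row, self.col) in self.cells:
                self.col += 1
            value = "".join(self.text).strip()
            for dr in range(rowspan):
                for dc in range(colspan):
                    self.cells[(self.row + dr, self.col + dc)] = value
            self.col += colspan
            self.span = None

    def rows(self):
        n_rows = 1 + max(r for r, _ in self.cells)
        n_cols = 1 + max(c for _, c in self.cells)
        return [[self.cells.get((r, c), "") for c in range(n_cols)]
                for r in range(n_rows)]


# Hand-written sample table (illustrative only).
sample_html = """
<table>
  <tr><th rowspan="2">Region</th><th colspan="2">Sales</th></tr>
  <tr><th>Q1</th><th>Q2</th></tr>
  <tr><td>North</td><td>10</td><td>12</td></tr>
</table>
"""

parser = SpanExpander()
parser.feed(sample_html)
grid = parser.rows()
for row in grid:
    print(row)
```

Repeating the spanned text in every covered position is one design choice among several; leaving the covered positions empty would be equally valid, depending on what the consuming code expects.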

Frequently Asked Questions

Q: What makes this model unique?

The model produces three output formats (LaTeX, HTML, and Markdown) from a table image, supports efficient inference through LMDeploy, and is trained on the DocGenome dataset. Together these make it well suited to document processing applications. It also supports both English and Chinese.

Q: What are the recommended use cases?

The model is ideal for scientific publication processing, financial document analysis, automated document processing systems, and any application requiring accurate table extraction and conversion to structured formats. It's particularly useful in academic and professional environments where precise table formatting is crucial.
