StructTable-InternVL2-1B

Maintained by: U4R

Property          Value
Parameter Count   938M
License           Apache-2.0
Paper             View Paper
Languages         English, Chinese
Tensor Type       BF16

What is StructTable-InternVL2-1B?

StructTable-InternVL2-1B is a vision-language model that converts table images into structured formats: LaTeX, HTML, and Markdown. It is built on the InternVL2 architecture and targets table understanding and conversion.

Implementation Details

The model combines vision and language processing and is fine-tuned on both the DocGenome dataset and synthetic tabular data. It uses LMDeploy for efficient inference and supports multiple output formats.

  • Built on InternVL2-1B architecture with 938M parameters
  • Supports efficient inference through the LMDeploy toolkit
  • Trained on the DocGenome benchmark with over 2 million high-quality Image-LaTeX pairs
  • Implements data augmentation for improved robustness
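
As a rough sizing aid, the 938M parameter count and BF16 tensor type listed in this card imply a lower bound on the memory needed to hold the weights. The sketch below is back-of-the-envelope arithmetic only. It assumes every parameter is stored in BF16 (2 bytes), treats "938M" as exactly 938,000,000, and ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function name is illustrative.

```python
# Rough weight-memory estimate from the figures in this card.
# Assumption: all parameters are stored in BF16, i.e. 2 bytes each.
PARAM_COUNT = 938_000_000   # "938M" as listed in this card (rounded)
BYTES_PER_BF16 = 2          # bfloat16 is a 16-bit format


def weight_memory_bytes(param_count: int, bytes_per_param: int) -> int:
    """Bytes needed to hold the weights alone (no activations or cache)."""
    return param_count * bytes_per_param


total = weight_memory_bytes(PARAM_COUNT, BYTES_PER_BF16)
print(f"{total / 1e9:.2f} GB")     # decimal gigabytes
print(f"{total / 2**30:.2f} GiB")  # binary gibibytes
```

Under these assumptions the weights alone come to a little under 2 GB.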

Core Capabilities

  • Converts table images to LaTeX, HTML, and Markdown formats
  • Handles complex table structures including spanning cells
  • Supports both English and Chinese language processing
  • Provides high-efficiency inference with LMDeploy integration
  • Performs structural extraction and question answering on tables
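
Spanning cells matter mostly to whatever consumes the output. The following is a minimal sketch, not part of the model or its toolkit, of how a downstream script might expand `rowspan` and `colspan` in an HTML table string into a rectangular grid using only the Python standard library. The class and function names are hypothetical, and the sample HTML is hand-written for illustration rather than actual model output.

```python
from __future__ import annotations

from html.parser import HTMLParser


class TableGridParser(HTMLParser):
    """Collects <td>/<th> cells and expands rowspan/colspan into a grid."""

    def __init__(self) -> None:
        super().__init__()
        self.grid: dict[tuple[int, int], str] = {}  # (row, col) -> cell text
        self.row = -1
        self.col = 0
        self.cell = None  # (rowspan, colspan, text parts) while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row += 1
            self.col = 0
        elif tag in ("td", "th"):
            a = dict(attrs)
            self.cell = (int(a.get("rowspan") or 1), int(a.get("colspan") or 1), [])

    def handle_data(self, data):
        if self.cell is not None:
            self.cell[2].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell is not None:
            rowspan, colspan, parts = self.cell
            text = "".join(parts).strip()
            # Skip slots already claimed by a rowspan from an earlier row.
            while (self.row, self.col) in self.grid:
                self.col += 1
            # Copy the cell text into every slot the span covers.
            for dr in range(rowspan):
                for dc in range(colspan):
                    self.grid[(self.row + dr, self.col + dc)] = text
            self.col += colspan
            self.cell = None


def html_table_to_grid(html: str) -> list[list[str]]:
    """Return the table as a list of equal-length rows; gaps become ''."""
    parser = TableGridParser()
    parser.feed(html)
    parser.close()
    if not parser.grid:
        return []
    n_rows = max(r for r, _ in parser.grid) + 1
    n_cols = max(c for _, c in parser.grid) + 1
    return [[parser.grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


# Hand-written sample with one rowspan and one colspan (not model output).
sample = (
    "<table>"
    "<tr><th rowspan='2'>Model</th><th colspan='2'>Score</th></tr>"
    "<tr><th>A</th><th>B</th></tr>"
    "<tr><td>X</td><td>1</td><td>2</td></tr>"
    "</table>"
)
grid = html_table_to_grid(sample)
for row in grid:
    print(row)
```

Repeating the text in every spanned slot is one design choice among several. It keeps each row the same length, which is convenient for loading into a dataframe, but it discards the information that the slots were originally a single merged cell.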

Frequently Asked Questions

Q: What makes this model unique?

The model supports multiple output formats and was trained on the DocGenome benchmark, which covers 156 disciplinary classes. Its LMDeploy integration makes it efficient for production deployments.

Q: What are the recommended use cases?

The model suits scientific publication processing, financial document analysis, automated document processing systems, and any application that needs to convert table images into structured formats such as LaTeX, HTML, or Markdown.
