invoice-and-receipts_donut_v1

Maintained by: mychen76

Property          Value
Parameter Count   202M
License           Apache 2.0
Author            mychen76
Model Type        Vision-encoder-decoder
Downloads         1,520

What is invoice-and-receipts_donut_v1?

invoice-and-receipts_donut_v1 is a specialized vision-encoder-decoder model that converts invoice and receipt images directly into structured data (JSON or XML) without a separate OCR engine. It is built on the Donut architecture and has 202M parameters.

Implementation Details

The model uses a transformer-based architecture: a vision encoder reads the input image and a text decoder generates the structured output. It runs on PyTorch, with weights stored in the Safetensors format, and removes the traditional need for a separate OCR step.

  • Direct image-to-structured-text conversion
  • Supports both JSON and XML output formats
  • Optimized for receipt and invoice processing
  • Implements vision-encoder-decoder architecture
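
The image-to-structured-text flow described above can be sketched with the Hugging Face transformers Donut classes. This is a minimal sketch, not the author's reference code: the repository id "mychen76/invoice-and-receipts_donut_v1" and the task prompt "<s_receipt>" are assumptions that should be checked against the model's own page, and the function names clean_sequence and image_to_json are hypothetical.

```python
import re


def clean_sequence(sequence: str, eos_token: str, pad_token: str) -> str:
    """Strip end/padding tokens and the leading task-start tag from decoded output."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Remove only the first tag, which is the task prompt (assumed "<s_receipt>").
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def image_to_json(image_path: str,
                  model_id: str = "mychen76/invoice-and-receipts_donut_v1",  # assumed repo id
                  task_prompt: str = "<s_receipt>") -> dict:  # assumed task prompt
    # Heavy dependencies are imported lazily so clean_sequence works without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).eval()

    # Encode the image and the task prompt that starts the decoder.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids.to(device)

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    # Decode the generated tag sequence and convert it to a nested dict.
    decoded = processor.batch_decode(outputs.sequences)[0]
    sequence = clean_sequence(
        decoded, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    return processor.token2json(sequence)
```

Calling image_to_json("receipt.jpg") (a hypothetical file name) would download the weights on first use and return a Python dict; the keys it contains depend on how the model was trained, so inspect a few outputs before relying on any particular field name.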

Core Capabilities

  • Extracts header information including invoice numbers, dates, and tax IDs
  • Processes detailed line items with quantities, prices, and descriptions
  • Calculates and validates financial summaries
  • Handles complex document layouts and variations
  • Supports multiple currency formats and tax calculations
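
Whatever the model extracts, a downstream arithmetic check on the financial summary is cheap to add. The sketch below is one possible approach, not part of the model: the field names (items, quantity, unit_price, subtotal, tax, total) are hypothetical and must be mapped to the keys the model actually emits, and the parser assumes amounts use "." as the decimal separator.

```python
from decimal import Decimal


def to_decimal(text: str) -> Decimal:
    """Parse an amount such as '$1,234.50' by dropping symbols and thousands separators."""
    cleaned = "".join(ch for ch in text if ch.isdigit() or ch in ".-")
    return Decimal(cleaned)


def check_totals(doc: dict, tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Return a list of inconsistencies; an empty list means the sums agree."""
    problems = []
    # Sum quantity * unit price over all line items (hypothetical field names).
    line_sum = sum(
        (to_decimal(item["quantity"]) * to_decimal(item["unit_price"])
         for item in doc["items"]),
        Decimal("0"),
    )
    subtotal = to_decimal(doc["subtotal"])
    tax = to_decimal(doc["tax"])
    total = to_decimal(doc["total"])
    if abs(line_sum - subtotal) > tolerance:
        problems.append(f"line items sum to {line_sum}, subtotal says {subtotal}")
    if abs(subtotal + tax - total) > tolerance:
        problems.append(f"subtotal + tax is {subtotal + tax}, total says {total}")
    return problems


# Illustrative data only, not real model output.
example = {
    "items": [
        {"quantity": "2", "unit_price": "$3.50"},
        {"quantity": "1", "unit_price": "$10.00"},
    ],
    "subtotal": "$17.00",
    "tax": "$1.36",
    "total": "$18.36",
}
issues = check_totals(example)
```

Using Decimal rather than float avoids rounding noise in currency arithmetic; documents that fail the check can be routed to manual review instead of being posted automatically.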

Frequently Asked Questions

Q: What makes this model unique?

The model converts images directly to structured formats without an intermediate OCR step, which reduces computational overhead and simplifies the deployment pipeline.

Q: What are the recommended use cases?

The model is ideal for automated invoice processing systems, expense management solutions, accounting software integration, and any application requiring structured data extraction from invoice or receipt images.
