SmolDocling-256M-preview-mlx-bf16

Maintained by: ds4sd

Property         Value
Model Size       256M parameters
Framework        MLX
Format           BF16
Original Author  ds4sd
Model URL        Hugging Face

What is SmolDocling-256M-preview-mlx-bf16?

SmolDocling-256M-preview-mlx-bf16 is a document understanding model packaged for Apple's MLX framework, with weights stored in BF16. It converts document images into structured output that the Docling framework can parse, and it is particularly suited to documents that contain tables and lists.

Implementation Details

The model is implemented using MLX-VLM version 0.1.18 and requires Python 3.12 or higher. It relies on the docling-core library for document processing and accepts images from either a local file path or a URL.

  • Optimized for BF16 precision
  • Supports streaming generation
  • Integrated with docling-core for document structure parsing
  • Compatible with PIL for image processing
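
The local and URL-based image input mentioned above can be sketched as follows. This is a minimal illustration, not the model's own loading code: the helper names is_url, read_image_bytes, and load_image are hypothetical, and the only third-party call is Pillow's Image.open, imported lazily so the byte-level helpers work without Pillow installed.

```python
import io
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(source: str) -> bool:
    """Return True when source looks like an http(s) URL rather than a file path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def read_image_bytes(source: str, timeout: float = 10.0) -> bytes:
    """Fetch raw image bytes from a URL or read them from a local file."""
    if is_url(source):
        with urlopen(source, timeout=timeout) as response:
            return response.read()
    return Path(source).read_bytes()


def load_image(source: str):
    """Decode the bytes into a PIL image (hypothetical helper; requires Pillow)."""
    from PIL import Image  # imported lazily so the helpers above stay stdlib-only

    return Image.open(io.BytesIO(read_image_bytes(source)))
```

With this shape, the same load_image call accepts either a path such as a scanned page on disk or an https link, and the resulting PIL image can then be handed to whatever inference call is being used.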

Core Capabilities

  • Document structure recognition
  • Table and list extraction
  • HTML and Markdown export
  • Embedded image handling
  • Real-time token generation
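
Real-time token generation usually means consuming text chunks as they arrive and stopping once the document is complete. The sketch below shows that pattern with a plain Python iterable standing in for the model's stream. The function name collect_stream is hypothetical, and the "</doctag>" end marker is an assumption about the output format rather than something stated on this page.

```python
from typing import Iterable


def collect_stream(chunks: Iterable[str], stop_marker: str = "</doctag>") -> str:
    """Accumulate streamed text chunks and stop once stop_marker has appeared.

    The marker is searched for in the accumulated output, not in each chunk,
    so it is still detected when it is split across two chunks.
    """
    output = ""
    for chunk in chunks:
        output += chunk
        end = output.find(stop_marker)
        if end != -1:
            # Trim anything that was emitted after the closing marker.
            return output[: end + len(stop_marker)]
    # The stream ended without the marker; return whatever was generated.
    return output


if __name__ == "__main__":
    # A fake stream standing in for real model output.
    fake_stream = ["<doctag>", "<text>Hi</text>", "</doc", "tag>", "ignored"]
    print(collect_stream(fake_stream))
```

In a real setup the iterable would be the text pieces yielded by the streaming generator, and each chunk could also be printed as it arrives to show progress.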

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for two things: it is packaged for Apple's MLX framework, and it converts document images into structured formats with a relatively small parameter count of 256M.

Q: What are the recommended use cases?

The model is well suited to document parsing tasks, particularly for documents containing tables and lists. It is most useful on Apple Silicon hardware, where MLX runs natively.
