Nougat-Small

Property	Value
Parameter Count	247M
License	CC-BY-4.0
Paper	Nougat: Neural Optical Understanding for Academic Documents (Blecher et al., 2023)
Author	Facebook Research

What is nougat-small?

Nougat-small is the compact version of Nougat (Neural Optical Understanding for Academic Documents), a model from Facebook Research that converts academic PDFs into structured markdown. With 247M parameters, it needs less memory and compute than its larger counterparts.
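To make the resource claim concrete, the memory needed for the weights alone can be estimated from the parameter count. This is a back-of-the-envelope sketch: the function name `weight_memory_mib` is made up for illustration, and the estimate assumes that only the weights are counted, so activations, the decoder cache, and framework overhead come on top.

```python
# Rough weight-memory estimate for a 247M-parameter model.
# Assumption: only the weights are counted; activations, the decoder's
# key/value cache, and framework overhead are excluded.

PARAMETER_COUNT = 247_000_000  # parameter count stated for nougat-small

BYTES_PER_PARAMETER = {
    "float32": 4,
    "float16": 2,
}


def weight_memory_mib(parameter_count: int, dtype: str) -> float:
    """Return the size of the weights alone, in MiB, for the given dtype."""
    return parameter_count * BYTES_PER_PARAMETER[dtype] / (1024 ** 2)


for dtype in BYTES_PER_PARAMETER:
    print(f"{dtype}: ~{weight_memory_mib(PARAMETER_COUNT, dtype):.0f} MiB")
# prints:
# float32: ~942 MiB
# float16: ~471 MiB
```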

Implementation Details

The model pairs a Swin Transformer, which encodes the visual content of a page, with an mBART model, which decodes that encoding into text. This encoder-decoder design lets the model read a document at the pixel level, from rendered page images, and generate the corresponding markdown output.

Vision Encoder: Swin Transformer for processing document images
Text Decoder: mBART-based architecture for markdown generation
Autoregressive prediction capability
Optimized for academic document understanding
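The encoder-decoder flow listed above can be sketched with the Hugging Face transformers library, which provides the `NougatProcessor` and `VisionEncoderDecoderModel` classes. This is a minimal sketch under stated assumptions, not the authors' reference pipeline: the checkpoint name `facebook/nougat-small`, the helper names `load_nougat` and `page_to_markdown`, and the `max_new_tokens` default are all chosen for illustration, and the input is assumed to be a single page that has already been rasterized to an image.

```python
from typing import Any, Tuple


def load_nougat(checkpoint: str = "facebook/nougat-small") -> Tuple[Any, Any]:
    """Load the processor and model (downloads the weights on first use).

    Assumption: the `transformers` and `torch` packages are installed and
    the checkpoint name above is the one you want.
    """
    # Imported lazily so the rest of this sketch can be read and tested
    # without the heavy dependencies.
    from transformers import NougatProcessor, VisionEncoderDecoderModel

    processor = NougatProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    return processor, model


def page_to_markdown(page_image: Any, processor: Any, model: Any,
                     max_new_tokens: int = 1024) -> str:
    """Convert one rasterized page (for example a PIL image) to markup text.

    `max_new_tokens` is an illustrative default; dense pages may need more.
    """
    # Vision side: resize and normalize the page into the tensor that the
    # Swin encoder expects.
    pixel_values = processor(page_image, return_tensors="pt").pixel_values

    # Text side: the mBART decoder emits tokens one at a time
    # (autoregressively) until end-of-sequence or the token limit.
    token_ids = model.generate(
        pixel_values,
        min_length=1,
        max_new_tokens=max_new_tokens,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
    )

    text = processor.batch_decode(token_ids, skip_special_tokens=True)[0]
    return processor.post_process_generation(text, fix_markdown=False)
```

With Pillow installed, a call might look like `processor, model = load_nougat()` followed by `page_to_markdown(Image.open("page.png"), processor, model)`, where `page.png` is a hypothetical file. Passing the processor and model in as arguments keeps the conversion step separate from loading, so the weights are loaded once and reused across pages.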

Core Capabilities

PDF-to-markdown conversion
Academic document processing
Visual layout understanding
Structured text generation
Support for scientific content parsing

Frequently Asked Questions

Q: What makes this model unique?

Nougat-small delivers document understanding with only 247M parameters, which makes it cheaper to run than its larger counterparts. It is optimized for academic content and can process complex scientific documents.

Q: What are the recommended use cases?

The model suits automated processing of academic papers, research documents, and scientific PDFs that need to be converted to markdown. It is particularly useful for turning PDF documents into searchable, structured content in academic and research settings.
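For the automated-processing use case, a batch job over a folder of PDFs could be organized as in the sketch below. Everything here is illustrative: `convert_folder` is a made-up helper, the `.md` output suffix is an assumption, and `convert` is a hypothetical placeholder for whatever turns one PDF into markdown (for example, a wrapper around nougat-small), not a real API.

```python
from pathlib import Path
from typing import Callable, List


def convert_folder(input_dir: Path, output_dir: Path,
                   convert: Callable[[Path], str],
                   suffix: str = ".md") -> List[Path]:
    """Run `convert` on every PDF under input_dir and save the returned text.

    `convert` is a hypothetical placeholder that maps one PDF path to its
    markdown text. Returns the paths of the files written on this run.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for pdf_path in sorted(input_dir.rglob("*.pdf")):
        # Mirror the input's relative location so files with the same name
        # in different subfolders do not overwrite each other.
        target = (output_dir / pdf_path.relative_to(input_dir)).with_suffix(suffix)
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.exists():
            continue  # skip documents converted on a previous run
        target.write_text(convert(pdf_path), encoding="utf-8")
        written.append(target)
    return written
```

Skipping targets that already exist makes the job resumable, which matters when each conversion is slow; delete an output file to force that document to be converted again.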
