NuExtract-1.5-smol

numind

NuExtract-1.5-smol: a 1.71B-parameter multilingual model fine-tuned from SmolLM2-1.7B for structured information extraction, released under the MIT license.

Property	Value
Parameter Count	1.71B
Model Type	Text Generation / Information Extraction
License	MIT
Tensor Type	BF16
Base Model	SmolLM2-1.7B

What is NuExtract-1.5-smol?

NuExtract-1.5-smol is a language model specialized for structured information extraction. It is a fine-tuned version of SmolLM2-1.7B, intended to keep extraction quality high at a smaller size than its larger counterparts. It is also multilingual and can process texts in several languages.

Implementation Details

The model has a relatively small footprint of 1.71B parameters. Its weights are stored in BF16, which uses two bytes per parameter (roughly 3.4 GB for the weights alone, half of what FP32 would need). It is designed for JSON template-based extraction: you supply an input text together with a JSON template describing the fields to extract, and the model fills in the template.

Optimized for zero-shot performance across multiple languages
Uses template-based extraction methodology
Handles long documents through a sliding-window approach
Works best with temperature at or near 0, which keeps the extraction output close to deterministic
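
As a rough illustration of the template-based approach, the sketch below builds an extraction prompt from a JSON template and an input text. The `<|input|>` / `<|output|>` markers and the `### Template:` / `### Text:` headings are an assumption about the prompt layout and should be checked against the official model card. `build_prompt`, `PROMPT_FORMAT`, and `GENERATION_KWARGS` are hypothetical names introduced here for illustration.

```python
import json

# Assumed prompt layout; verify against the official model card before use.
PROMPT_FORMAT = "<|input|>\n### Template:\n{template}\n### Text:\n{text}\n\n<|output|>"

# Greedy decoding is the deterministic counterpart of temperature 0.
# Both keys are standard Hugging Face `generate` keyword arguments.
GENERATION_KWARGS = {"do_sample": False, "max_new_tokens": 512}


def build_prompt(template: dict, text: str) -> str:
    """Render a JSON template and a source text into one prompt string."""
    template_json = json.dumps(template, indent=4, ensure_ascii=False)
    return PROMPT_FORMAT.format(template=template_json, text=text)


# Empty strings and empty lists mark the fields the model should fill in.
template = {"name": "", "license": "", "languages": []}
text = "NuExtract-1.5-smol is an MIT-licensed model from numind."
prompt = build_prompt(template, text)
```

The resulting string would then be tokenized and passed to the model's `generate` call together with `GENERATION_KWARGS`; loading the model itself is left out so the sketch runs without downloading any weights.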

Core Capabilities

Structured information extraction from unstructured text
Multilingual support with strong zero-shot performance
Template-based extraction using JSON schemas
Efficient processing of long sequences
Pure extraction focus: extracted values are taken from the source text with little or no rewording
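
For long inputs, one generic option is to split the text into overlapping windows and run extraction on each window in turn. The sketch below shows only the splitting step. It is a plain word-based chunker written for illustration, not the model's own long-document mechanism, and `sliding_windows`, `window_size`, and `overlap` are hypothetical names.

```python
def sliding_windows(text: str, window_size: int, overlap: int) -> list[str]:
    """Split text into word-based windows that share `overlap` words."""
    if window_size <= 0 or not 0 <= overlap < window_size:
        raise ValueError("need window_size > 0 and 0 <= overlap < window_size")
    words = text.split()
    step = window_size - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + window_size]))
        if start + window_size >= len(words):
            break  # this window already reaches the end of the text
    return windows


chunks = sliding_windows("a b c d e f g", window_size=4, overlap=1)
# chunks == ["a b c d", "d e f g"]
```

The overlap reduces the chance that an entity straddling a window boundary is cut in half. In practice the window size would be chosen in tokens rather than words, to match the model's context limit.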

Frequently Asked Questions

Q: What makes this model unique?

The model combines a compact size (1.71B parameters) with strong extraction capabilities and multilingual support, and is described as outperforming larger models on some extraction tasks. It is optimized for pure extraction, so the generated values closely match the source text.
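
Because extracted values are expected to match the source text, a simple post-check can flag values that do not appear verbatim in the input. The helper below is a minimal sketch of such a check. `non_verbatim_values` is a hypothetical name, and the example output is made up for illustration rather than produced by the model.

```python
def non_verbatim_values(extracted, source: str) -> list[str]:
    """Return every non-empty string in `extracted` that is absent from `source`."""
    if isinstance(extracted, dict):
        children = list(extracted.values())
    elif isinstance(extracted, list):
        children = extracted
    elif isinstance(extracted, str):
        return [extracted] if extracted and extracted not in source else []
    else:
        return []  # numbers, booleans and None are ignored
    missing = []
    for child in children:
        missing.extend(non_verbatim_values(child, source))
    return missing


source = "NuExtract-1.5-smol is fine-tuned from SmolLM2-1.7B."
# Illustrative output only; "Apache" does not occur in the source text.
output = {"model": "NuExtract-1.5-smol", "base": "SmolLM2-1.7B", "license": "Apache"}
suspect = non_verbatim_values(output, source)
# suspect == ["Apache"]
```

Values flagged this way are not necessarily wrong (the model may normalize whitespace or casing), but they are good candidates for manual review.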

Q: What are the recommended use cases?

The model is best suited to structured information extraction, where precise data must be pulled from unstructured text. Typical uses include automated data extraction, document processing, and multilingual information retrieval, where both accuracy and efficiency matter.