NuExtract-1.5-smol

Maintained by: numind

Property          Value
Parameter Count   1.71B
Model Type        Text Generation
License           MIT
Tensor Type       BF16

What is NuExtract-1.5-smol?

NuExtract-1.5-smol is a language model fine-tuned from SmolLM2-1.7B for structured information extraction. It is multilingual and compact: at 1.7B parameters it is less than half the size of its larger counterpart, NuExtract-1.5 (3.8B parameters).

Implementation Details

The model is loaded through the transformers library and runs in bfloat16 precision for efficient inference. It expects a specific input format: a JSON template describing the fields to extract, followed by the input text. The output follows the shape of that template, which keeps extraction results structured and predictable.
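As a rough illustration of the template-plus-text input, the sketch below assembles a prompt string from a JSON template and a passage. The `<|input|>` / `<|output|>` markers and the `### Template:` / `### Text:` headers are an assumption about the expected layout and should be checked against the official model card; `build_prompt` is a hypothetical helper name, not part of any library.

```python
import json


def build_prompt(template: dict, text: str) -> str:
    """Assemble an extraction prompt from a JSON template and input text.

    Assumption: the marker and header strings below mirror the layout the
    model was trained on. Verify them against the official model card.
    """
    template_json = json.dumps(template, indent=4)
    return (
        "<|input|>\n"
        f"### Template:\n{template_json}\n"
        f"### Text:\n{text}\n\n"
        "<|output|>"
    )


# Empty strings and empty lists mark the fields the model should fill in.
template = {"name": "", "languages": [], "license": ""}
text = "NuExtract-1.5-smol is released by numind under the MIT license."

prompt = build_prompt(template, text)
print(prompt)
```

The resulting string would then be tokenized and passed to the model's `generate` method; loading the weights with `torch_dtype=torch.bfloat16` matches the precision described above.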

  • Built on SmolLM2-1.7B architecture
  • Optimized for zero-shot performance across multiple languages
  • Implements efficient inference with bfloat16 precision
  • Supports a sliding-window approach for long sequences, in which the text is processed in successive overlapping chunks
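One way to handle inputs longer than a single pass allows is to slide a window over the text and carry the partial result forward from chunk to chunk. The sketch below is a minimal version of that idea under stated assumptions: windows are measured in characters rather than tokens, the default sizes are arbitrary, and `extract` is a hypothetical callable standing in for one model call that takes a chunk and the current template and returns the updated JSON string.

```python
from typing import Callable, List


def chunk_text(text: str, window: int, overlap: int) -> List[str]:
    """Split text into windows of `window` characters that overlap by `overlap`."""
    if window <= overlap:
        raise ValueError("window must be larger than overlap")
    step = window - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + window])
        if start + window >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def sliding_extract(
    text: str,
    template: str,
    extract: Callable[[str, str], str],
    window: int = 4000,
    overlap: int = 128,
) -> str:
    """Run `extract` over each chunk, feeding the previous output back in."""
    current = template
    for chunk in chunk_text(text, window, overlap):
        # Hypothetical model call: returns the template updated with
        # whatever could be extracted from this chunk.
        current = extract(chunk, current)
    return current


# Stand-in extractor for demonstration only: records the chunk lengths.
def fake_extract(chunk: str, current: str) -> str:
    return current + f"[{len(chunk)}]"


print(chunk_text("abcdefghij", window=4, overlap=1))
print(sliding_extract("abcdefghij", "", fake_extract, window=4, overlap=1))
```

A token-based window would be closer to real usage, since model limits are expressed in tokens; characters are used here only to keep the sketch dependency-free.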

Core Capabilities

  • Structured information extraction from text
  • Multilingual support with strong zero-shot performance
  • Pure extraction focus with high accuracy
  • Handling of long inputs by processing the text in successive windows
  • Template-based extraction with JSON output

Frequently Asked Questions

Q: What makes this model unique?

The model's ability to perform structured information extraction while maintaining a relatively small parameter count (1.71B) sets it apart. It's specifically designed to prioritize pure extraction, meaning the generated text is typically present verbatim in the source material.
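Because pure extraction means the values should typically appear verbatim in the source, a simple post-check can flag values that do not. The following is a minimal sketch of such a check; `non_verbatim_values` is a hypothetical helper, and exact substring matching is an assumption that will also flag harmless differences such as changed whitespace or casing.

```python
from typing import Any, List


def non_verbatim_values(extracted: Any, source: str) -> List[str]:
    """Return extracted string values that do not occur verbatim in `source`."""
    missing: List[str] = []

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, str) and node and node not in source:
            # Empty strings mean "nothing found" and are skipped.
            missing.append(node)

    walk(extracted)
    return missing


source = "Alice joined Acme Corp in 2021."
extracted = {"person": "Alice", "company": "Acme Corp", "roles": ["CEO"], "year": ""}

print(non_verbatim_values(extracted, source))
```

Values reported by such a check are not necessarily wrong, but they are the ones worth reviewing first.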

Q: What are the recommended use cases?

The model excels at extracting structured information from text using JSON templates. It's particularly useful for automated data extraction, document parsing, and information retrieval tasks across multiple languages. For optimal performance, it's recommended to use the model with a temperature setting at or very close to 0.
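To see why a temperature near 0 suits extraction, it helps to look at what temperature does to the next-token distribution: logits are divided by the temperature before the softmax, so small values push almost all probability onto the single most likely token. The sketch below shows this with made-up logits; it is an illustration of the general mechanism, not of this model's internals.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Softmax over logits scaled by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use greedy decoding for 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # illustrative values only
for t in (1.0, 0.1, 0.01):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 4) for p in probs])
```

In the limit this becomes greedy decoding, which in the transformers library corresponds to calling `generate` with `do_sample=False`.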
