SmolLM2-360M-Instruct

SmolLM2-360M-Instruct

HuggingFaceTB

SmolLM2-360M-Instruct is a compact 362M parameter language model optimized for instruction following, trained on 4T tokens with improved reasoning capabilities.

PropertyValue
Parameter Count362M
Training Tokens4 trillion
LicenseApache 2.0
ArchitectureTransformer decoder
PrecisionBFloat16

What is SmolLM2-360M-Instruct?

SmolLM2-360M-Instruct is part of the SmolLM2 family of compact language models, specifically designed for efficient instruction following while maintaining a small footprint. This model represents a significant advancement over its predecessor, demonstrating enhanced capabilities in knowledge processing, reasoning, and instruction following.

Implementation Details

The model was trained on a diverse dataset combination including FineWeb-Edu, DCLM, and The Stack. It underwent supervised fine-tuning (SFT) using both public and curated datasets, followed by Direct Preference Optimization (DPO) using UltraFeedback. The training process utilized 64 H100 GPUs and was implemented using the nanotron framework.

  • Zero-shot performance superior to comparable models in multiple benchmarks
  • Achieves 41.0% on IFEval for instruction following
  • Supports text rewriting and summarization tasks
  • Optimized for both GPU and CPU deployment

Core Capabilities

  • Strong performance in HellaSwag (52.1%) and PIQA (70.8%)
  • Enhanced reasoning capabilities with 43.7% accuracy on ARC
  • Efficient instruction following and task completion
  • Lightweight enough for on-device deployment

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for achieving impressive performance metrics despite its compact size of 362M parameters, making it suitable for deployment in resource-constrained environments while maintaining strong capabilities in instruction following and reasoning tasks.

Q: What are the recommended use cases?

The model is particularly well-suited for text generation, summarization, and instruction-following tasks. It's ideal for applications requiring efficient on-device deployment while maintaining robust performance in natural language understanding and generation.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026