SmolLM2-135M-Instruct

HuggingFaceTB

SmolLM2-135M-Instruct: a compact 135M-parameter language model optimized for instruction following, trained on 2 trillion tokens and aligned with supervised fine-tuning (SFT) and Direct Preference Optimization (DPO).

| Property | Value |
| --- | --- |
| Parameter Count | 135M |
| Training Tokens | 2 trillion |
| License | Apache 2.0 |
| Architecture | Transformer decoder |
| Precision | BFloat16 |

What is SmolLM2-135M-Instruct?

SmolLM2-135M-Instruct is a compact yet powerful language model designed for efficient instruction following and general text generation. As part of the SmolLM2 family, it represents a significant advancement over its predecessor, particularly excelling in instruction following, knowledge application, and reasoning capabilities. The model was trained on an extensive and diverse dataset of 2 trillion tokens, including FineWeb-Edu, DCLM, and The Stack.

Implementation Details

The model was first supervised fine-tuned (SFT) and then aligned with Direct Preference Optimization (DPO) on the UltraFeedback dataset. Training was carried out on 64 H100 GPUs using the nanotron framework, and the resulting model performs well across a range of benchmarks.

  • Zero-shot performance improvements over predecessor in multiple benchmarks
  • Supports text rewriting and summarization tasks
  • Optimized for efficient on-device deployment
  • Implements chat template for conversational applications
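The chat template mentioned above can be exercised through the Hugging Face `transformers` library. The following is a minimal sketch, not an excerpt from the model card: the prompt and the generation settings (`max_new_tokens`, `temperature`, `top_p`) are illustrative choices.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-135M-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

# Build a conversation and let the tokenizer apply the model's chat template.
messages = [{"role": "user", "content": "What is gravity?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

# Generate a short reply; sampling parameters here are illustrative.
outputs = model.generate(
    input_ids, max_new_tokens=50, temperature=0.2, top_p=0.9, do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt.
output_text = tokenizer.decode(
    outputs[0][input_ids.shape[1]:], skip_special_tokens=True
)
print(output_text)
```

Because the chat template is stored with the tokenizer, the same code works for any SmolLM2 instruct checkpoint by swapping the `checkpoint` string.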

Core Capabilities

  • Instruction following with 29.9% average performance on IFEval
  • Strong performance on reasoning tasks (28.2% on BBH 3-shot)
  • Efficient text generation and summarization
  • Lightweight deployment options with ONNX and Transformers.js support

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its exceptional performance-to-size ratio, delivering strong capabilities in instruction following and reasoning tasks despite its compact 135M parameter size. It's specifically optimized for on-device deployment while maintaining competitive performance metrics.

Q: What are the recommended use cases?

The model is well-suited for text generation, summarization, and instruction-following tasks. It's particularly valuable for applications requiring efficient on-device deployment or where computational resources are limited, while still needing reliable language model capabilities.
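For a quick summarization-style use, the `transformers` text-generation pipeline can be pointed at the model; in recent `transformers` versions the pipeline applies the chat template automatically when given a list of chat messages. The sketch below assumes that behavior, and the sample article text is invented for illustration.

```python
from transformers import pipeline

# The text-generation pipeline applies the model's chat template
# when the input is a list of chat messages (recent transformers versions).
generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-135M-Instruct")

# Illustrative input text, not from the model card.
article = (
    "The town council voted on Tuesday to repair the old bridge, "
    "citing safety concerns raised by residents over the past year."
)
messages = [{"role": "user", "content": f"Summarize in one sentence: {article}"}]

result = generator(messages, max_new_tokens=60, do_sample=False)

# The pipeline returns the full conversation; the model's reply is the last turn.
reply = result[0]["generated_text"][-1]["content"]
print(reply)
```

At 135M parameters the model fits comfortably on CPU, which is consistent with the on-device deployment focus described above.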
