StableLM 2 1.6B
| Property | Value |
|---|---|
| Parameter Count | 1.64B |
| Architecture | Decoder-only Transformer |
| Training Data | 2 trillion tokens |
| License | Stability AI Community License |
| Supported Languages | English, German, Spanish, French, Italian, Dutch, Portuguese |
| Paper | Stable LM 2 1.6B Technical Report |
What is stablelm-2-1_6b?
StableLM 2 1.6B is a state-of-the-art small decoder-only language model developed by Stability AI, pre-trained on 2 trillion tokens of diverse multilingual text and code. The architecture comprises 24 transformer layers, 32 attention heads, and a hidden size of 2048; a quick way to check these hyperparameters locally is shown in the sketch below.
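A minimal sketch for inspecting those hyperparameters with the Hugging Face transformers library; the repo id stabilityai/stablelm-2-1_6b is assumed from the official release, and the expected values in the comments simply mirror the figures quoted above.

```python
from transformers import AutoConfig

# Assumed official repo id on Hugging Face; recent transformers versions
# include native StableLM support (older ones need trust_remote_code=True).
config = AutoConfig.from_pretrained("stabilityai/stablelm-2-1_6b")

print(config.num_hidden_layers)        # expected: 24
print(config.num_attention_heads)      # expected: 32
print(config.hidden_size)              # expected: 2048
print(config.max_position_embeddings)  # expected: 4096
print(config.vocab_size)               # expected: 100352
```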
Implementation Details
The model uses Rotary Position Embeddings (applied to the first 25% of head embedding dimensions), LayerNorm with learned bias terms rather than RMSNorm, and retains bias terms only on the query, key, and value projections, with all other attention and feed-forward biases removed. It supports Flash Attention 2 for faster attention and employs the Arcade100k tokenizer with a vocabulary size of 100,352. A minimal loading sketch follows the list below.
- 2048 hidden size with 24 layers and 32 attention heads
- 4096-token context length
- Trained on multiple high-quality datasets, including Falcon RefinedWeb and RedPajama-Data-1T
- Supports Flash Attention 2 for faster training and inference
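A minimal loading sketch under the same assumptions (official stabilityai/stablelm-2-1_6b repo, a recent transformers release); Flash Attention 2 additionally requires the flash-attn package and a supported NVIDIA GPU, so treat that argument as optional.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "stabilityai/stablelm-2-1_6b"  # assumed official Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,               # half precision keeps the 1.6B weights compact
    attn_implementation="flash_attention_2",  # needs flash-attn and a supported GPU
    device_map="auto",                        # place weights on the available device
)
```

Dropping the attn_implementation argument falls back to the default attention kernels on machines without flash-attn installed.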
Core Capabilities
- Multilingual text generation across 7 languages
- Code generation and processing
- Text completion with configurable sampling parameters (see the generation sketch below)
- Designed as a base model for fine-tuning on downstream tasks
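To illustrate text completion with configurable sampling parameters, here is a self-contained sketch; the prompt and sampling values are arbitrary examples, not tuned defaults.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "stabilityai/stablelm-2-1_6b"  # assumed official repo id
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "The weather today is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,  # higher values -> more diverse completions
    top_p=0.95,       # nucleus sampling cutoff
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```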
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for combining multilingual coverage across seven languages with a compact 1.6B-parameter footprint. Its optimized attention mechanisms, including Flash Attention 2 support, keep it practical to run and fine-tune on modest hardware.
Q: What are the recommended use cases?
The model is primarily intended as a base model for fine-tuning in specific applications. It's particularly suitable for multilingual text generation, code-related tasks, and can be adapted for various downstream applications after appropriate fine-tuning and safety evaluations.
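As one illustration of the fine-tuning path, the sketch below attaches LoRA adapters with the peft library; this is a common parameter-efficient recipe, not the procedure from the technical report. The target module names (q_proj, k_proj, v_proj) assume the Hugging Face StableLM implementation, and train.txt stands in for your own corpus.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

MODEL_ID = "stabilityai/stablelm-2-1_6b"  # assumed official repo id
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # batching requires a pad token

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# LoRA on the attention projections; module names assume the HF StableLM code.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "k_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# "train.txt" is a placeholder for your own plain-text training corpus.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="stablelm2-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1, bf16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Training only the low-rank adapters keeps memory requirements far below full fine-tuning of the 1.6B weights, which is why this route is a common starting point for a model of this size.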