stablelm-2-1_6b

Maintained by: stabilityai

StableLM 2 1.6B

Parameter Count: 1.64B
Architecture: Decoder-only Transformer
Training Data: 2 trillion tokens
License: Stability AI Community License
Supported Languages: English, German, Spanish, French, Italian, Dutch, Portuguese
Paper: Stable LM 2 1.6B Technical Report

What is stablelm-2-1_6b?

StableLM 2 1.6B is a 1.6-billion-parameter decoder-only language model developed by Stability AI. It was trained on a diverse dataset of 2 trillion tokens spanning multiple languages and code, and its architecture uses 24 layers, 32 attention heads, and a hidden size of 2,048.

Implementation Details

The model combines Rotary Position Embeddings, LayerNorm with learned bias terms, and bias terms restricted to the query, key, and value projections of the attention layers. It supports Flash Attention 2 for faster inference and uses the Arcade100k tokenizer with a vocabulary size of 100,352.

  • Hidden size of 2,048 with 24 layers and 32 attention heads
  • Context length of 4,096 tokens
  • Trained on multiple high-quality datasets, including Falcon RefinedWeb and RedPajama-Data-1T
  • Supports Flash Attention 2 for faster inference (see the loading sketch after this list)
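
The following is a minimal loading-and-generation sketch. It assumes the `stabilityai/stablelm-2-1_6b` checkpoint on the Hugging Face Hub, a recent transformers release with native StableLM support, and a CUDA GPU; the `attn_implementation` argument is optional and can be dropped if flash-attn is not installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-1_6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # half precision keeps the 1.6B weights compact
    attn_implementation="flash_attention_2",  # optional; remove if flash-attn is not installed
).to("cuda")

inputs = tokenizer("The weather today is", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```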

Core Capabilities

  • Multilingual text generation across the seven supported languages (see the example after this list)
  • Code generation and processing
  • Efficient text completion with customizable sampling parameters
  • Designed as a base model for fine-tuning on downstream tasks
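
As a quick multilingual illustration, the same `generate()` call from the loading sketch above works unchanged on non-English prompts; the German prompt here is purely an illustrative example and reuses the `model` and `tokenizer` objects defined earlier.

```python
# Reuses the model and tokenizer loaded in the previous sketch.
prompt = "Die Hauptstadt von Frankreich ist"  # "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True, temperature=0.7, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```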

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficient architecture combining multilingual capabilities with a relatively compact parameter count. It features optimized attention mechanisms and supports Flash Attention 2, making it both powerful and practical for various applications.

Q: What are the recommended use cases?

The model is primarily intended as a base model for fine-tuning on specific applications. It is particularly suitable for multilingual text generation and code-related tasks, and it can be adapted to a range of downstream applications after appropriate fine-tuning and safety evaluation.
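
For the fine-tuning use case, one common lightweight approach is LoRA adaptation via the peft library. The sketch below is an illustrative example rather than an official recipe: it assumes peft, datasets, and transformers are installed, uses a placeholder plain-text training file, and the `target_modules` names are an assumption based on standard attention projection layers rather than values taken from the model card.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "stabilityai/stablelm-2-1_6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # padding is needed for batched training

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Attach low-rank adapters so only a small fraction of the weights is trained.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Tokenize a small plain-text corpus (train.txt is a placeholder for your own data).
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="stablelm2-lora",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("stablelm2-lora-adapter")  # saves only the adapter weights
```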

🍰 Interested in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.