Helium-1-preview-2b

Maintained by: kyutai

Parameter Count: 2 billion
Model Type: Large Language Model
Languages: English, French, German, Italian, Portuguese, Spanish
License: CC-BY 4.0
Context Length: 4096 tokens
Model URL: https://huggingface.co/kyutai/helium-1-preview-2b

What is helium-1-preview-2b?

Helium-1-preview-2b is a lightweight multilingual language model designed for edge and mobile devices. Developed by Kyutai, this 2-billion-parameter model aims to deliver efficient, performant language modeling in resource-constrained environments while maintaining strong capabilities across six European languages.

Implementation Details

The model features a 24-layer architecture with 20 attention heads and a model dimension of 2560. It was trained on a diverse dataset including Wikipedia, Stack Exchange, open-access scientific articles, and Common Crawl, using JAX on 128 NVIDIA H100 GPUs. The model employs a context window of 4096 tokens and a RoPE theta value of 100,000; these hyperparameters are summarized in the sketch after the list below.

  • 24 transformer layers with 20 attention heads
  • 2560 model dimension and 7040 MLP dimension
  • Trained on high-quality multilingual datasets
  • Optimized for edge deployment
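
For reference, here is a minimal sketch collecting the architecture hyperparameters stated above into a plain Python dictionary. The key names are illustrative only and are not the checkpoint's actual configuration schema:

```python
# Architecture hyperparameters as described in this model card.
# NOTE: the key names below are illustrative and do not necessarily
# match the checkpoint's real configuration fields.
helium_1_preview_2b = {
    "num_layers": 24,                 # transformer layers
    "num_attention_heads": 20,        # attention heads per layer
    "model_dim": 2560,                # hidden / model dimension
    "mlp_dim": 7040,                  # feed-forward (MLP) dimension
    "context_length": 4096,           # maximum context window in tokens
    "rope_theta": 100_000,            # RoPE theta value
    "num_parameters": 2_000_000_000,  # ~2B parameters
}
```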

Core Capabilities

  • Strong performance across multiple languages with 60.7% average accuracy on English benchmarks
  • Competitive results on MMLU, TriviaQA, and other standard benchmarks
  • Efficient multilingual processing with support for 6 European languages
  • Designed for edge deployment and mobile applications

Frequently Asked Questions

Q: What makes this model unique?

Helium-1-preview-2b stands out for its efficient architecture optimized for edge devices while maintaining strong multilingual capabilities. It achieves competitive performance with larger models while using only 2B parameters.

Q: What are the recommended use cases?

The model is best suited for research and development in natural language processing, particularly in resource-constrained environments. Note that, as a base model, it requires additional fine-tuning or alignment for specific downstream applications. A minimal loading sketch follows below.
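
As a rough illustration, here is a minimal sketch of loading the checkpoint for text generation, assuming it is served through the standard Hugging Face transformers AutoModelForCausalLM / AutoTokenizer interface; consult the model page linked above for the officially documented usage:

```python
# Minimal generation sketch -- assumes the checkpoint loads through the
# standard transformers auto classes; check the model card for the
# officially supported setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kyutai/helium-1-preview-2b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 2B model fits comfortably in bf16
    device_map="auto",           # requires the accelerate package
)

# Helium-1 preview is a base model: prompt it with plain text to continue,
# not with chat-style instructions.
prompt = "La capitale de la France est"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since the model covers six European languages, the same pipeline can be used for English, French, German, Italian, Portuguese, and Spanish prompts.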
