Bielik-7B-v0.1

Maintained by: speakleash

  • Parameter Count: 7.24B
  • License: Apache 2.0
  • Language: Polish
  • Paper: arXiv:2410.18565
  • Base Model: Mistral-7B-v0.1

What is Bielik-7B-v0.1?

Bielik-7B-v0.1 is a Polish language model developed jointly by SpeakLeash and ACK Cyfronet AGH. Built on Mistral-7B-v0.1, it was trained on over 70 billion tokens of carefully curated Polish text, making it particularly strong at understanding and generating Polish.
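
For readers who want to try the model, below is a minimal loading-and-generation sketch using the standard Hugging Face transformers API. The checkpoint ID speakleash/Bielik-7B-v0.1 and the prompt and generation settings are illustrative assumptions, not instructions from the model card.

```python
# Minimal sketch: loading Bielik-7B-v0.1 with Hugging Face transformers.
# The checkpoint ID below is assumed; verify it against the model's page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "speakleash/Bielik-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 training precision
    device_map="auto",
)

# Base-model usage: plain text continuation, not chat-style prompting.
prompt = "Najwyższym szczytem Polski jest"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is the base (non-instruct) variant, it responds best to text it can continue rather than to questions or instructions.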

Implementation Details

The model was trained with the ALLaMo framework on the Helios supercomputer, using 256 NVIDIA GH200 cards at a throughput exceeding 9,200 tokens/GPU/second. The run covered 36 billion unique tokens over two epochs (roughly 72 billion tokens processed in total, consistent with the figure above) with mixed-precision (bfloat16) training.

  • Context length: 4096 tokens
  • Training batch size: 4,194,304 tokens (1024 sequences × 4096 tokens)
  • Learning rate: 3e-05 → 2e-05 on a cosine schedule (see the sketch after this list)
  • Training-data quality screened with an XGBoost classifier
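
To make the schedule concrete, the sketch below implements a plain cosine decay from 3e-5 to 2e-5. The total step count is hypothetical, since the card does not state it; only the endpoints and the cosine shape come from the listed settings.

```python
# Sketch of a cosine learning-rate decay from 3e-5 to 2e-5.
# total_steps is hypothetical; the run's real step count is not published here.
import math

def cosine_lr(step: int, total_steps: int,
              lr_max: float = 3e-5, lr_min: float = 2e-5) -> float:
    """Cosine interpolation from lr_max (step 0) down to lr_min (final step)."""
    progress = min(step / total_steps, 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

for step in (0, 5_000, 10_000, 15_000, 20_000):
    print(f"step {step:>6}: lr = {cosine_lr(step, 20_000):.2e}")
```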

Core Capabilities

  • State-of-the-art performance on RAG Reader tasks (88.39% score)
  • Perplexity of 123.31 (lower is better; perplexity depends on the tokenizer, so it is not directly comparable across models; see the sketch below)
  • Robust text generation and understanding in Polish
  • Designed as a base model for further fine-tuning across a range of use cases
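
As an illustration of how a perplexity figure like 123.31 is computed, the sketch below evaluates a causal LM on a short Polish sentence with transformers. The checkpoint ID and sample text are assumptions; the corpus behind the reported score is not specified in this card, and real evaluations average over far more text.

```python
# Sketch: computing perplexity of a causal LM on a piece of Polish text.
# Perplexity depends on the tokenizer and evaluation corpus, so values are
# not directly comparable between models with different vocabularies.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "speakleash/Bielik-7B-v0.1"  # assumed checkpoint ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

text = "Wisła jest najdłuższą rzeką w Polsce."
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels supplied, the model returns the mean token-level
    # cross-entropy loss; perplexity is exp(loss).
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"perplexity: {torch.exp(loss).item():.2f}")
```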

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its specialized focus on Polish: it was trained on large, carefully curated Polish text corpora, with document quality validated by an XGBoost-based classification step.

Q: What are the recommended use cases?

As a base model, Bielik-7B-v0.1 is designed for further fine-tuning across various applications. For direct chatting or instruction-following, users should consider the Bielik-7B-Instruct-v0.1 variant instead.
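
Since the base model is positioned as a starting point for fine-tuning, one common route is parameter-efficient tuning with LoRA. The sketch below wraps the model with the peft library; every hyperparameter shown is an illustrative assumption, not a setting recommended by the Bielik authors.

```python
# Sketch: attaching a LoRA adapter to the base model via the peft library.
# All hyperparameters are illustrative, not recommendations from the card.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("speakleash/Bielik-7B-v0.1")

lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumed)
    lora_alpha=32,                         # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],   # attention projections, Mistral naming
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 7.24B params train
```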
