Bielik-7B-v0.1
| Property | Value |
|---|---|
| Parameter Count | 7.24B |
| License | Apache 2.0 |
| Language | Polish |
| Paper | arxiv:2410.18565 |
| Base Model | Mistral-7B-v0.1 |
What is Bielik-7B-v0.1?
Bielik-7B-v0.1 is a Polish language model developed through a collaboration between SpeakLeash and ACK Cyfronet AGH. Built on Mistral-7B-v0.1, it was trained on over 70 billion tokens of carefully curated Polish text, making it particularly strong at understanding and generating Polish.
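As a base (non-instruct) model, it can be loaded and prompted for plain text completion with the Hugging Face transformers library. The sketch below assumes the model is published on the Hub as `speakleash/Bielik-7B-v0.1`; the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub id assumed from the model name in this card.
model_id = "speakleash/Bielik-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 training precision
    device_map="auto",
)

# Base-model usage: the model simply continues the Polish prompt.
prompt = "Najdłuższa rzeka w Polsce to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```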
Implementation Details
The model was trained using the ALLaMo framework on the Helios supercomputer, utilizing 256 NVIDIA GH200 accelerators at a throughput exceeding 9,200 tokens/GPU/second. Training covered a 36-billion-token dataset for two epochs (the 70+ billion tokens noted above) in mixed precision (bfloat16).
- Context length: 4,096 tokens
- Batch size: 4,194,304 tokens
- Learning rate: 3e-05 → 2e-05 (cosine schedule; see the sketch after this list)
- Training-data quality filtering using an XGBoost classifier
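The cosine decay from 3e-05 to 2e-05 can be reproduced in a few lines. Warmup and the exact step count used for Bielik are not given in this card, so this is a minimal sketch of the decay shape only.

```python
import math

def cosine_lr(step: int, total_steps: int,
              lr_max: float = 3e-5, lr_min: float = 2e-5) -> float:
    """Cosine decay from lr_max down to lr_min over total_steps."""
    progress = min(step / total_steps, 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

# The schedule starts at 3e-05 and ends at 2e-05:
print(cosine_lr(0, 10_000))       # 3e-05
print(cosine_lr(10_000, 10_000))  # 2e-05
```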
Core Capabilities
- State-of-the-art performance in RAG Reader tasks (88.39% score)
- Perplexity of 123.31 (perplexity values are comparable only between models using the same tokenizer)
- Robust text generation and understanding in Polish
- Optimized for further fine-tuning across various use cases
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its specialized optimization for Polish language processing, achieved through extensive training on high-quality Polish corpora filtered with the XGBoost-based quality classification mentioned above.
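For illustration, a quality classifier of this kind can be built with xgboost over simple stylometric features. The features, labels, and toy data below are assumptions made for the sketch; the actual feature set and labeling scheme used by SpeakLeash are not described in this card.

```python
import numpy as np
from xgboost import XGBClassifier

def doc_features(text: str) -> list[float]:
    """Simple stylometric features (hypothetical, for illustration only)."""
    words = text.split() or [""]
    return [
        len(text),                                              # document length
        sum(len(w) for w in words) / len(words),                # mean word length
        sum(c in ".,;:!?" for c in text) / max(len(text), 1),   # punctuation ratio
        sum(c.isupper() for c in text) / max(len(text), 1),     # uppercase ratio
    ]

# Toy training data: documents labeled by quality (0 = LOW, 1 = HIGH).
texts = [
    "To jest poprawnie napisany, spójny akapit tekstu po polsku.",
    "!!!! kliknij TUTAJ teraz ZA DARMO !!!!",
    "Model językowy przetwarza tekst na sekwencje tokenów.",
    "aaa bbb ccc ddd eee fff ggg",
]
labels = [1, 0, 1, 0]

clf = XGBClassifier(n_estimators=50, max_depth=3, eval_metric="logloss")
clf.fit(np.array([doc_features(t) for t in texts]), np.array(labels))

# Keep only documents the classifier scores as high quality.
new_doc = "Bielik to polski model językowy oparty na Mistral-7B."
print(clf.predict(np.array([doc_features(new_doc)]))[0])
```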
Q: What are the recommended use cases?
As a base model, Bielik-7B-v0.1 is designed for further fine-tuning across various applications. For direct chatting or instruction-following, users should consider the Bielik-7B-Instruct-v0.1 variant instead.
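For parameter-efficient fine-tuning of the base model, a common route is LoRA via the peft library. The sketch below is minimal and hedged: the target module names follow the Mistral-7B architecture, and all hyperparameters are illustrative rather than recommended values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "speakleash/Bielik-7B-v0.1"  # Hub id assumed from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# LoRA adapters on the attention projections; module names follow the
# Mistral architecture, hyperparameters here are illustrative only.
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
# ... continue with a standard Trainer / SFT loop on Polish task data.
```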