# Breeze-7B-Instruct-v0_1
| Property | Value |
|---|---|
| Parameter Count | 7.49B |
| Model Type | Causal decoder-only transformer |
| License | Apache 2.0 |
| Paper | Technical Report |
| Context Length | 8k tokens |
## What is Breeze-7B-Instruct-v0_1?
Breeze-7B-Instruct-v0_1 is an instruction-tuned language model developed by MediaTek Research, specifically designed to excel at Traditional Chinese language tasks while maintaining strong English capabilities. Built upon Mistral-7B, it features an expanded vocabulary of 62,000 tokens (30,000 more than the base model) to better handle Traditional Chinese characters, resulting in twice the inference speed for Chinese text processing.
## Implementation Details

The model uses a causal decoder-only transformer architecture; its main changes relative to the Mistral-7B base are summarized in the list below. It supports BF16 precision and can be accelerated with Flash Attention 2 for faster inference (see the loading sketch after the list).
- Expanded 62k-token vocabulary optimized for Traditional Chinese
- 8k token context window
- Multi-turn dialogue support
- Built on the Mistral-7B architecture
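To ground the points above, here is a minimal loading-and-chat sketch using the Hugging Face `transformers` API (`attn_implementation` requires transformers ≥ 4.36, and Flash Attention 2 requires the `flash-attn` package and a supported GPU). The sketch assumes the repository ships a chat template; if it does not, the Mistral-style `[INST]` prompt should be built by hand.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MediaTek-Research/Breeze-7B-Instruct-v0_1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # BF16 precision
    attn_implementation="flash_attention_2",  # assumes flash-attn is installed; omit for default attention
    device_map="auto",
)

# Multi-turn dialogue: prior turns go into the message list,
# and the tokenizer's chat template formats the full conversation.
messages = [
    {"role": "user", "content": "請簡單介紹台北。"},
    {"role": "assistant", "content": "台北是臺灣的政治與經濟中心。"},
    {"role": "user", "content": "那裡有哪些著名景點?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```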
## Core Capabilities
- Strong performance on both Traditional Chinese and English benchmarks
- Roughly 2x faster inference on Traditional Chinese text than the Mistral-7B base, owing to the expanded vocabulary (see the tokenizer comparison below)
- Excels in Q&A, RAG, multi-round chat, and summarization tasks
- Competitive MT-Bench scores (5.7 for Traditional Chinese, 7.1 for English)
- Improved results over the base model on TMMLU+ and other Traditional Chinese benchmarks
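The speed claim above comes from tokenization density: with a 62k vocabulary, the same Traditional Chinese passage encodes into far fewer tokens, so generation takes fewer decoding steps. A quick comparison sketch (assuming both repositories are downloadable; the sample sentence is illustrative and exact counts will vary):

```python
from transformers import AutoTokenizer

breeze = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0_1")
mistral = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

text = "人工智慧正在改變我們的生活方式與工作型態。"

# Fewer tokens per passage -> fewer autoregressive steps -> faster generation.
print("Breeze tokens: ", len(breeze(text)["input_ids"]))
print("Mistral tokens:", len(mistral(text)["input_ids"]))
```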
## Frequently Asked Questions
**Q: What makes this model unique?**

The expanded vocabulary and Traditional Chinese optimization, combined with retained English capability, set it apart. The model achieves faster inference while delivering competitive performance across benchmarks.
**Q: What are the recommended use cases?**

The model is particularly well suited to Traditional Chinese tasks: question answering, chat applications, summarization, and RAG pipelines (a minimal prompt-construction sketch follows). It is designed for production deployment with efficient inference.
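For the RAG use case, retrieved passages can simply be folded into the user turn before applying the chat template from the earlier sketch. A minimal, hypothetical helper (the retrieval step and the Chinese prompt wording are placeholders, not part of the released model):

```python
def build_rag_messages(question: str, passages: list[str]) -> list[dict]:
    """Fold retrieved context into a single user turn for the chat template."""
    context = "\n\n".join(passages)
    return [{
        "role": "user",
        "content": f"根據以下資料回答問題。\n\n資料:\n{context}\n\n問題:{question}",
    }]

# Usage: feed the result to tokenizer.apply_chat_template(...) as in the loading sketch.
messages = build_rag_messages("台北101有多高?", ["台北101高508公尺,曾是世界第一高樓。"])
```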