Breeze-7B-Instruct-v1_0

Maintained By
MediaTek-Research

Breeze-7B-Instruct-v1_0

PropertyValue
Parameter Count7.49B
LicenseApache 2.0
PaperarXiv:2403.02712
Context Length8K tokens
LanguagesEnglish, Traditional Chinese

What is Breeze-7B-Instruct-v1_0?

Breeze-7B-Instruct-v1_0 is an advanced language model developed by MediaTek Research, specifically designed to excel in both Traditional Chinese and English language tasks. Built upon the Mistral-7B architecture, it features an expanded vocabulary of 62,000 tokens (30,000 more than the original) to better support Traditional Chinese processing.

Implementation Details

The model is implemented as a causal decoder-only transformer, fine-tuned from Breeze-7B-Base. It utilizes BF16 precision and incorporates advanced features for efficient processing in multilingual contexts.

  • Expanded vocabulary (62k tokens) optimized for Traditional Chinese
  • 8,000 token context window
  • Multi-turn dialogue capability
  • 2x faster inference speed for Traditional Chinese compared to base Mistral-7B

Core Capabilities

  • Strong performance in MT-Bench-tw with a score of 6.0
  • Competitive MMLU accuracy of 61.73%
  • Excellent Traditional Chinese reasoning and knowledge tasks
  • Efficient processing of long-form content
  • Support for Q&A, RAG, multi-round chat, and summarization

Frequently Asked Questions

Q: What makes this model unique?

The model's expanded vocabulary and optimization for Traditional Chinese processing, combined with its competitive performance against larger models like GPT-3.5-Turbo in specific tasks, makes it particularly valuable for Traditional Chinese applications.

Q: What are the recommended use cases?

The model excels in Traditional Chinese text generation, Q&A systems, multi-turn conversations, and content summarization. It's particularly suitable for applications requiring both English and Traditional Chinese language capabilities.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.