SeaLLM-7B-v2
Property | Value |
---|---|
Parameter Count | 7.38B |
Model Type | Multilingual LLM |
Architecture | Based on Mistral-7B |
License | SeaLLMs License |
Paper | arXiv:2312.00738 |
What is SeaLLM-7B-v2?
SeaLLM-7B-v2 is a state-of-the-art multilingual language model specifically designed for Southeast Asian languages. Built upon Mistral-7B architecture, it supports 10 languages including English, Chinese, Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Myanmar, and Filipino. The model represents a significant advancement in multilingual AI, achieving remarkable performance across various tasks while maintaining a relatively compact size.
Implementation Details
The model utilizes BF16 precision and implements a carefully designed continuation pre-training approach from Mistral-7B. It features a specialized chat template format and supports various deployment options including vLLM, transformers, and local installations through LM-studio or ollama.
- Achieves 78.2 score on zero-shot CoT GSM8K (state-of-the-art for 7B models)
- Scores 7.54 on MT-bench, ranking 3rd in the 7B category
- Supports efficient inference with multiple deployment options
Core Capabilities
- Superior mathematical reasoning across multiple languages
- Strong performance in zero-shot commonsense reasoning
- Competitive multilingual world knowledge understanding
- Efficient instruction following in 10 Southeast Asian languages
- Outperforms GPT-3.5 in various translated benchmarks
Frequently Asked Questions
Q: What makes this model unique?
SeaLLM-7B-v2 stands out for its exceptional performance in Southeast Asian languages while maintaining strong capabilities in mathematical reasoning and commonsense tasks. It achieves this with only 7B parameters, making it both efficient and practical for deployment.
Q: What are the recommended use cases?
The model excels in multilingual conversations, mathematical problem-solving, general knowledge tasks, and instruction following across Southeast Asian languages. It's particularly suitable for applications requiring strong reasoning capabilities in multiple languages.