SUS-Chat-34B
| Property | Value |
|---|---|
| Model Size | 34B parameters |
| Context Window | 8K tokens |
| License | Apache 2.0 |
| Framework | PyTorch |
Framework | PyTorch |
What is SUS-Chat-34B?
SUS-Chat-34B is a state-of-the-art bilingual language model jointly developed by Southern University of Science and Technology and IDEA-CCNL. Built upon the Yi-34B architecture, this model has been fine-tuned on 1.4 billion tokens of high-quality instruction data, making it particularly powerful for both Chinese and English language tasks.
Implementation Details
The model uses a transformer architecture with inter-instruction attention sharing, which effectively doubles the context window from 4K to 8K tokens. It is implemented in PyTorch and is compatible with the standard LLaMA ecosystem.
- Extended context window of 8K tokens for improved long-text processing
- Comprehensive instruction tuning with 1.4B tokens of high-quality data
- State-of-the-art performance on benchmarks such as MMLU, CMMLU, and C-Eval
- Optimized for multi-turn dialogues and complex reasoning tasks
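Since the model targets multi-turn dialogue within the LLaMA ecosystem, a prompt-construction helper illustrates how conversation history is typically flattened into a single input string. The `### Human:` / `### Assistant:` markers below are an assumption based on the model's public card; verify the exact template against the official repository before relying on it. This is a minimal sketch, not the model's definitive interface.

```python
# Hypothetical multi-turn prompt builder for SUS-Chat-34B.
# The "### Human:"/"### Assistant:" markers are assumed from the model
# card and may differ; check the official template before use.

def build_prompt(history: list[tuple[str, str]], query: str) -> str:
    """Flatten (user, assistant) turns plus a new query into one prompt."""
    prompt = ""
    for user_msg, assistant_msg in history:
        prompt += f"### Human: {user_msg}\n\n### Assistant: {assistant_msg}\n\n"
    # Trailing "### Assistant: " cues the model to produce the next reply.
    prompt += f"### Human: {query}\n\n### Assistant: "
    return prompt
```

The resulting string would then be tokenized and passed to the model through the usual `transformers` generation path.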
Core Capabilities
- Exceptional performance in Chinese language tasks (82.42% on C-Eval)
- Strong mathematical and reasoning capabilities (80.06% on GSM8K)
- Advanced dialogue handling with extended context understanding
- Robust performance across various benchmarks, surpassing many larger models
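Because long multi-turn dialogues can still overflow the 8K window, applications usually trim the oldest turns before each generation step. The sketch below assumes a crude whitespace-based token estimate as a stand-in for the model's real tokenizer; both the helper names and the default budget are illustrative.

```python
# Minimal sketch: keep only the most recent dialogue turns that fit a
# token budget (8K for SUS-Chat-34B). The whitespace count is a crude
# placeholder; substitute the model's tokenizer in practice.

def count_tokens(text: str) -> int:
    # Placeholder estimate, not the real tokenization.
    return len(text.split())

def trim_history(turns: list[str], budget: int = 8192) -> list[str]:
    """Drop the oldest turns until the remaining ones fit the budget."""
    kept: list[str] = []
    total = 0
    for turn in reversed(turns):          # walk newest-first
        cost = count_tokens(turn)
        if total + cost > budget:
            break
        kept.append(turn)
        total += cost
    return list(reversed(kept))           # restore chronological order
```

Trimming whole turns (rather than truncating mid-turn) keeps each retained exchange intact, which matters for instruction-tuned chat models.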
Frequently Asked Questions
Q: What makes this model unique?
SUS-Chat-34B stands out for its bilingual capabilities and comprehensive instruction tuning, achieving state-of-the-art benchmark results without adding parameters beyond the base Yi-34B model. It particularly excels at Chinese-language tasks while maintaining strong English performance.
Q: What are the recommended use cases?
The model is ideal for bilingual applications, complex reasoning tasks, multi-turn dialogues, and general language understanding tasks. It's particularly well-suited for academic and commercial applications requiring strong performance in both Chinese and English.