SUS-Chat-34B
| Property | Value |
|---|---|
| Model Size | 34B parameters |
| Context Window | 8K tokens |
| License | Apache 2.0 |
| Framework | PyTorch |
Framework | PyTorch |
What is SUS-Chat-34B?
SUS-Chat-34B is a state-of-the-art bilingual language model jointly developed by Southern University of Science and Technology and IDEA-CCNL. Built upon the Yi-34B architecture, this model has been fine-tuned on 1.4 billion tokens of high-quality instruction data, making it particularly powerful for both Chinese and English language tasks.
Implementation Details
The model uses a transformer architecture with inter-instruction attention sharing, which effectively doubles the context window from 4K to 8K tokens. It is implemented in PyTorch and is compatible with the standard LLaMA ecosystem.
- Extended context window of 8K tokens for improved long-text processing
- Comprehensive instruction tuning with 1.4B tokens of high-quality data
- State-of-the-art performance on benchmarks such as MMLU, CMMLU, and C-Eval
- Optimized for multi-turn dialogues and complex reasoning tasks
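Since the model targets multi-turn dialogue within the LLaMA ecosystem, a prompt-construction helper illustrates how conversation history is typically flattened into a single input string. The `### Human:` / `### Assistant:` markers below are an assumption based on the model's public card; verify the exact template against the official repository before relying on it. This is a minimal sketch, not the model's definitive interface.

```python
# Hypothetical multi-turn prompt builder for SUS-Chat-34B.
# The "### Human:"/"### Assistant:" markers are assumed from the model
# card and may differ; check the official template before use.

def build_prompt(history: list[tuple[str, str]], query: str) -> str:
    """Flatten (user, assistant) turns plus a new query into one prompt."""
    prompt = ""
    for user_msg, assistant_msg in history:
        prompt += f"### Human: {user_msg}\n\n### Assistant: {assistant_msg}\n\n"
    # Trailing "### Assistant: " cues the model to produce the next reply.
    prompt += f"### Human: {query}\n\n### Assistant: "
    return prompt
```

The resulting string would then be tokenized and passed to the model through the usual `transformers` generation path.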
Core Capabilities
- Exceptional performance in Chinese language tasks (82.42% on C-Eval)
- Strong mathematical and reasoning capabilities (80.06% on GSM8K)
- Advanced dialogue handling with extended context understanding
- Robust performance across various benchmarks, surpassing many larger models
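Because long multi-turn dialogues can still overflow the 8K window, applications usually trim the oldest turns before each generation step. The sketch below assumes a crude whitespace-based token estimate as a stand-in for the model's real tokenizer; both the helper names and the default budget are illustrative.

```python
# Minimal sketch: keep only the most recent dialogue turns that fit a
# token budget (8K for SUS-Chat-34B). The whitespace count is a crude
# placeholder; substitute the model's tokenizer in practice.

def count_tokens(text: str) -> int:
    # Placeholder estimate, not the real tokenization.
    return len(text.split())

def trim_history(turns: list[str], budget: int = 8192) -> list[str]:
    """Drop the oldest turns until the remaining ones fit the budget."""
    kept: list[str] = []
    total = 0
    for turn in reversed(turns):          # walk newest-first
        cost = count_tokens(turn)
        if total + cost > budget:
            break
        kept.append(turn)
        total += cost
    return list(reversed(kept))           # restore chronological order
```

Trimming whole turns (rather than truncating mid-turn) keeps each retained exchange intact, which matters for instruction-tuned chat models.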
Frequently Asked Questions
Q: What makes this model unique?
SUS-Chat-34B stands out for its bilingual capabilities and comprehensive instruction tuning, achieving state-of-the-art benchmark results without adding parameters beyond the base Yi-34B model. It particularly excels at Chinese-language tasks while maintaining strong English performance.
Q: What are the recommended use cases?
The model is ideal for bilingual applications, complex reasoning tasks, multi-turn dialogues, and general language understanding tasks. It's particularly well-suited for academic and commercial applications requiring strong performance in both Chinese and English.