Llama3-70B-Chinese-Chat

Maintained By
shenzhi-wang

Llama3-70B-Chinese-Chat

PropertyValue
Parameter Count70.6B
Model TypeLanguage Model
ArchitectureLlama3
LicenseLlama3 License
Training FrameworkLLaMA-Factory
PaperORPO Paper

What is Llama3-70B-Chinese-Chat?

Llama3-70B-Chinese-Chat is a sophisticated large language model specifically fine-tuned for Chinese and English language capabilities. Built upon Meta's Llama-3-70B-Instruct model, it has been trained on over 100,000 preference pairs using the ORPO (Reference-free Monolithic Preference Optimization) algorithm. The model demonstrates exceptional performance in Chinese language tasks, matching GPT-4's capabilities on benchmarks like C-Eval and CMMLU.

Implementation Details

The model was trained using full-parameter fine-tuning with specific hyperparameters including a learning rate of 1.5e-6, cosine learning rate scheduler, and a context length of 8192 tokens. The training process utilized the LLaMA-Factory framework with a global batch size of 128 and the paged_adamw_32bit optimizer.

  • Training epochs: 3
  • Warmup ratio: 0.1
  • ORPO beta: 0.05
  • Context length: 8192 tokens

Core Capabilities

  • Bilingual proficiency in Chinese and English
  • Advanced roleplaying abilities
  • Strong mathematical reasoning
  • Function calling capabilities
  • Matches GPT-4 performance on Chinese benchmarks
  • Achieves 66.1% on C-Eval and 70.28% on CMMLU

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its exceptional Chinese language capabilities, matching GPT-4's performance on major Chinese benchmarks while maintaining strong English language abilities. It's one of the first LLMs specifically fine-tuned for Chinese and English users with various advanced capabilities.

Q: What are the recommended use cases?

The model excels in bilingual conversations, roleplay scenarios, mathematical problem-solving, and function calling tasks. It's particularly well-suited for applications requiring strong Chinese language understanding and generation capabilities.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.