Llama3-8B-Chinese-Chat

Llama3-8B-Chinese-Chat

shenzhi-wang

Advanced 8B parameter Chinese-English LLM built on Llama3, optimized for bilingual dialogue with enhanced capabilities in roleplay, function calling & math. Trained on 100K preference pairs.

PropertyValue
Parameter Count8.03B
Context Length8K tokens
Base ModelMeta-Llama-3-8B-Instruct
LicenseLlama3 License
Training FrameworkLLaMA-Factory

What is Llama3-8B-Chinese-Chat?

Llama3-8B-Chinese-Chat is an advanced bilingual language model specifically fine-tuned for Chinese and English interactions. Built upon Meta's Llama-3-8B-Instruct model, it has been optimized using ORPO (Reference-free Monolithic Preference Optimization) on approximately 100K preference pairs, making it particularly effective for Chinese-language tasks while maintaining strong English capabilities.

Implementation Details

The model was trained using full parameter fine-tuning with specific hyperparameters including a learning rate of 3e-6, cosine scheduler, and a context length of 8192 tokens. The training process utilized the ORPO methodology with a beta value of 0.05 and a global batch size of 128.

  • Trained using paged_adamw_32bit optimizer
  • 2 epochs of training with 0.1 warmup ratio
  • BF16 precision for optimal performance
  • Implements flash attention for efficient processing

Core Capabilities

  • Advanced bilingual dialogue generation
  • Enhanced roleplay capabilities
  • Sophisticated function calling
  • Improved mathematical reasoning
  • Context-aware responses in both Chinese and English
  • Reduced tendency to mix languages in responses

Frequently Asked Questions

Q: What makes this model unique?

This model represents the first Llama3-based model specifically optimized for Chinese-English bilingual interactions using ORPO methodology. It significantly reduces issues with language mixing and improves upon the base model's capabilities in roleplay, function calling, and mathematical reasoning.

Q: What are the recommended use cases?

The model excels in bilingual conversations, creative writing, mathematical problem-solving, and roleplay scenarios. It's particularly well-suited for applications requiring natural Chinese language generation while maintaining English capabilities.

Related Models

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026