Baichuan-7B-sft

hiyouga

Bilingual (Chinese/English) instruction-tuned 7B-parameter LLM, fine-tuned with LoRA on Alpaca-style instruction datasets. Supports text generation with detailed responses.

Property            Value
License             Apache 2.0
Languages           Chinese, English
Training Framework  LLaMA-Factory
Base Model          Baichuan-7B

What is Baichuan-7B-sft?

Baichuan-7B-sft is a bilingual instruction-tuned language model built on the Baichuan-7B architecture. It was fine-tuned with LoRA (Low-Rank Adaptation) on several instruction datasets, including Alpaca, Alpaca-zh, and CodeAlpaca, which cover English instructions, Chinese instructions, and code tasks respectively.

Implementation Details

The model is built on the Transformers library and was fine-tuned with LoRA at rank 16, with adapters targeting all layers. Training used a cosine learning rate scheduler with a learning rate of 5e-5 over 2 epochs, with mixed-precision (FP16) training for efficiency.
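
To make the cosine schedule concrete, here is a minimal sketch of how the learning rate would decay from 5e-5 toward zero over training. It assumes no warmup phase and a hypothetical total of 1000 optimizer steps, since the text does not state either; the function name cosine_lr is illustrative and not part of any library.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float = 5e-5) -> float:
    """Learning rate at `step` under cosine decay from peak_lr down to 0.

    Assumes no warmup (the source does not mention one).
    """
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; the real step count is not given
    for step in (0, 250, 500, 750, 1000):
        print(f"step {step:4d}: lr = {cosine_lr(step, total_steps):.3e}")
```

The rate starts at the full 5e-5, reaches half of it at the midpoint, and ends at zero, which is the usual shape of a cosine schedule.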

  • Uses PyTorch backend with Transformers library
  • Implements LoRA fine-tuning methodology
  • Trained on multiple instruction datasets
  • Supports text streaming during generation
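
The appeal of rank-16 LoRA can be illustrated with some simple arithmetic. LoRA freezes a weight matrix W of shape (d_out, d_in) and learns two small matrices B (d_out x r) and A (r x d_in), so the trainable parameter count per adapted matrix is r * (d_in + d_out). The sketch below assumes a hidden size of 4096 for a square projection matrix; that figure is an assumption for illustration and is not stated in the text above.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    hidden = 4096  # assumed hidden size, for illustration only
    rank = 16      # LoRA rank reported for this model

    full = hidden * hidden                          # parameters in the frozen matrix
    added = lora_param_count(hidden, hidden, rank)  # parameters LoRA trains instead

    print(f"frozen matrix parameters: {full}")
    print(f"LoRA trainable parameters: {added}")
    print(f"ratio: {added / full:.7f}")
```

Under these assumptions, each adapted square matrix needs 131072 trainable parameters against 16777216 frozen ones, a ratio of 1/128.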

Core Capabilities

  • Bilingual instruction following (Chinese and English)
  • Code-related task handling through CodeAlpaca training
  • Streaming text generation
  • Interactive chat-style responses
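
Streaming generation with the Transformers library might look like the following sketch. The repository ID "hiyouga/Baichuan-7B-sft" and the "Human: ... Assistant: " prompt template are assumptions for illustration; check the model's own documentation for the exact identifier and template. Loading the model downloads several gigabytes of weights and realistically needs a GPU.

```python
def build_prompt(query: str) -> str:
    """Wrap a user query in a single-turn chat template.

    The template here is hypothetical; the source text does not specify one.
    """
    return f"Human: {query}\nAssistant: "


def stream_reply(query: str,
                 model_id: str = "hiyouga/Baichuan-7B-sft",  # assumed repo ID
                 max_new_tokens: int = 256) -> None:
    """Generate a reply and print tokens to stdout as they are produced."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # TextStreamer prints decoded text incrementally; skip_prompt hides the input.
    streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

    inputs = tokenizer(build_prompt(query), return_tensors="pt")
    model.generate(input_ids=inputs["input_ids"],
                   max_new_tokens=max_new_tokens,
                   streamer=streamer)


if __name__ == "__main__":
    stream_reply("Explain LoRA fine-tuning in two sentences.")
```

Keeping the prompt construction in its own function makes it easy to swap in the correct template once it is known, without touching the generation code.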

Frequently Asked Questions

Q: What makes this model unique?

This model combines bilingual (Chinese and English) instruction following with parameter-efficient LoRA fine-tuning. At 7B parameters, it also keeps a relatively small deployment footprint compared with larger LLMs.

Q: What are the recommended use cases?

The model is well-suited for chatbot applications, instruction following tasks, code-related queries, and bilingual text generation scenarios. It's particularly effective when deployed in applications requiring both Chinese and English language understanding.
