# Suzume LLaMA-3 8B Multilingual ORPO Borda Half
| Property | Value |
|---|---|
| Base Model | LLaMA-3 8B |
| Training Method | ORPO (Odds Ratio Preference Optimization) |
| License | Non-commercial |
| Languages | English, Chinese, French, German, Japanese, Russian |
## What is suzume-llama-3-8B-multilingual-orpo-borda-half?
This model is a multilingual variant of LLaMA-3 8B, fine-tuned with ORPO on the Mitsu dataset. It was trained only on the top 50% of responses by ranking consistency (the "Borda half" of the name) and performs strongly across six languages, improving notably on its base version and competing favorably with models such as Starling-LM-7B-beta and GPT-3.5-turbo in multilingual evaluation.
## Implementation Details
The model was trained with the Axolotl framework, using optimizations including gradient checkpointing, an 8-bit Adam optimizer, and cosine learning-rate scheduling. Training used a sequence length of 8192 and flash attention for improved efficiency. The key hyperparameters are listed below, followed by a rough sketch of an equivalent setup.
- Learning rate: 8e-6 with cosine scheduling
- Gradient accumulation steps: 8
- Training epochs: 1
- Validation loss: 0.0935
- BF16 mixed precision training
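For illustration, the hyperparameters above can be mapped onto a TRL `ORPOTrainer` run roughly as follows. This is a minimal sketch, not the authors' actual Axolotl configuration; the base-model and dataset ids, the `optim` string, and anything not listed above are assumptions.

```python
# Illustrative sketch only: the model was actually trained with Axolotl.
# This maps the reported hyperparameters onto TRL's ORPOTrainer; ids and
# any value not stated in the card are assumptions.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed base checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Preference data with prompt/chosen/rejected columns, as ORPO expects.
train_dataset = load_dataset("lightblue/mitsu_tophalf_borda", split="train")  # assumed dataset id

config = ORPOConfig(
    output_dir="suzume-orpo-borda-half",
    learning_rate=8e-6,             # from the card
    lr_scheduler_type="cosine",     # from the card
    gradient_accumulation_steps=8,  # from the card
    num_train_epochs=1,             # from the card
    max_length=8192,                # sequence length from the card
    bf16=True,                      # BF16 mixed precision
    optim="adamw_bnb_8bit",         # 8-bit Adam
    gradient_checkpointing=True,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # `tokenizer=` on older TRL releases
)
trainer.train()
```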
## Core Capabilities
- Strong multilingual performance with MT-Bench scores exceeding 7.5 in most languages
- Strongest results in Russian (8.94) and English (7.98)
- Balanced response generation across different linguistic contexts
- Efficient handling of context-heavy prompts
## Frequently Asked Questions
**Q: What makes this model unique?**
The model's distinctive feature is its training recipe: ORPO applied to a curated subset of the Mitsu dataset, specifically the 50% of training prompts whose responses were most consistently ranked across judge models (aggregated via Borda count, hence the model name). This selective strategy produced improved performance across the evaluated languages. A toy illustration of the selection idea follows.
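The sketch below shows one way a Borda count can aggregate per-judge rankings and keep the half of the prompts where judges agree most. All names are hypothetical, and the consistency measure (winner's Borda margin) is an assumption; the real Mitsu pipeline may define it differently.

```python
# Hypothetical sketch: aggregate judge rankings with a Borda count, then
# keep the half of the prompts where judges agree most on the winner.

def borda_scores(rankings):
    """rankings: one list per judge, best response first (same candidates
    for every judge). A response at position p among n candidates earns
    n - 1 - p points from that judge."""
    n = len(rankings[0])
    scores = {resp: 0 for resp in rankings[0]}
    for ranking in rankings:
        for position, resp in enumerate(ranking):
            scores[resp] += n - 1 - position
    return scores

def select_top_half(prompt_rankings):
    """prompt_rankings: {prompt: [judge rankings]}. Consistency here is the
    Borda margin of the winner over the runner-up (an assumption)."""
    margins, winners = {}, {}
    for prompt, rankings in prompt_rankings.items():
        ranked = sorted(borda_scores(rankings).items(),
                        key=lambda kv: kv[1], reverse=True)
        winners[prompt] = ranked[0][0]
        margins[prompt] = ranked[0][1] - ranked[1][1]
    keep = sorted(margins, key=margins.get, reverse=True)[: len(margins) // 2]
    return {p: winners[p] for p in keep}

# Toy example: two judges ranking three responses for each of two prompts.
example = {
    "p1": [["a", "b", "c"], ["a", "c", "b"]],  # judges agree: "a" wins clearly
    "p2": [["x", "y", "z"], ["z", "y", "x"]],  # judges disagree: zero margin
}
print(select_top_half(example))  # {'p1': 'a'}
```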
**Q: What are the recommended use cases?**
The model is well suited to multilingual applications that require high-quality response generation. However, because its license is non-commercial (inherited from the Command R/R+ outputs used to build the training data), it is limited to research and other non-commercial use.
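A minimal inference sketch, assuming the model is published on Hugging Face under the id `lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half` (not confirmed by this card) and follows the Llama 3 chat template:

```python
# Minimal inference sketch; the repo id is an assumption.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style prompt; the pipeline applies the model's chat template.
# French: "Explain photosynthesis in two sentences."
messages = [{"role": "user",
             "content": "Expliquez la photosynthèse en deux phrases."}]
out = pipe(messages, max_new_tokens=128, do_sample=False)
print(out[0]["generated_text"][-1]["content"])
```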