llama-3-8b-gpt-4o-ru1.0-gguf

Maintained By
ruslandev

llama-3-8b-gpt-4o-ru1.0-gguf

PropertyValue
Parameter Count8.03B
LicenseLLaMA3
Base Modelmeta-llama/Meta-Llama-3-8B-Instruct
Downloads73,877

What is llama-3-8b-gpt-4o-ru1.0-gguf?

This is a specialized version of LLaMA-3 8B model, fine-tuned specifically for enhanced Russian language capabilities. The model was trained on a carefully curated dataset derived from tagengo-gpt4, with 80% of training examples focused on Russian language content. Its performance matches or exceeds GPT-3.5-turbo in Russian language tasks, achieving an impressive MT-Bench score of 8.12 for Russian and 8.01 for English.

Implementation Details

The model was trained using the Axolotl framework on 2 NVIDIA A100 GPUs for 1 epoch. It implements advanced features like gradient checkpointing and flash attention, using a cosine learning rate scheduler with a 1e-5 learning rate.

  • Sample packing enabled for efficient training
  • 8192 sequence length
  • Trained using DeepSpeed ZeRO-2 optimization
  • Uses 8-bit AdamW optimizer

Core Capabilities

  • Superior Russian language understanding and generation
  • Competitive performance in both Russian (8.12) and English (8.01) on MT-Bench
  • Optimized for GGUF format for efficient deployment
  • Compatible with llama.cpp for local execution

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its specialized Russian language capabilities while maintaining strong English performance, achieved through focused training on high-quality GPT-4o generated data. It matches the performance of larger models trained on 8x bigger datasets.

Q: What are the recommended use cases?

The model is particularly well-suited for Russian language tasks, multilingual applications, and scenarios requiring efficient local deployment through GGUF format. It can be easily used with llama.cpp or the gptchain framework for chat-based applications.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.