qwq-bakeneko-32b


Parameter Count: 32 billion
Architecture: 64-layer transformer, hidden size 5120
License: Apache License 2.0
Release Date: March 13, 2025
Author: rinna

What is qwq-bakeneko-32b?

QwQ Bakeneko 32B is an instruction-tuned Japanese reasoning model. Built on the rinna/qwen2.5-bakeneko-32b foundation, it combines a Chat Vector merge with Odds Ratio Preference Optimization (ORPO) to improve performance on Japanese language tasks.

Implementation Details

The model is built through a multi-stage process that combines model merging with preference-based distillation. First, a Chat Vector, obtained by subtracting the parameter vector of Qwen/Qwen2.5-32B from that of Qwen/QwQ-32B, is added to the base model's weights, transferring QwQ's reasoning behavior into the Japanese model. The merged model is then refined with ORPO training on roughly 1.3k curated samples generated by DeepSeek-R1.
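The merge itself is simple parameter arithmetic. Below is a minimal sketch, not rinna's published pipeline: it assumes all three checkpoints share parameter names and shapes, that embeddings are untied, and that enough memory is available to hold them at once (a production merge would stream weight shards instead).

```python
import torch
from transformers import AutoModelForCausalLM

# Load the Japanese base model and the two donor checkpoints.
base = AutoModelForCausalLM.from_pretrained(
    "rinna/qwen2.5-bakeneko-32b", torch_dtype=torch.bfloat16
)
reasoning = AutoModelForCausalLM.from_pretrained(
    "Qwen/QwQ-32B", torch_dtype=torch.bfloat16
)
plain = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-32B", torch_dtype=torch.bfloat16
)

reasoning_sd = reasoning.state_dict()
plain_sd = plain.state_dict()

with torch.no_grad():
    for name, param in base.state_dict().items():
        # Chat Vector: reasoning-tuned weights minus their plain base,
        # added onto the Japanese model's weights in place. Assumes
        # identical parameter names/shapes across all three models.
        param.add_(reasoning_sd[name] - plain_sd[name])

base.save_pretrained("qwq-bakeneko-32b-merged")  # illustrative output path
```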

  • 64-layer transformer architecture with hidden size 5120
  • Chat Vector merging for improved instruction following
  • ORPO training for stronger reasoning (a sketch follows this list)
  • Benchmark results showing strong performance on Japanese tasks (see Core Capabilities below)
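The ORPO stage could look roughly like the following sketch built on Hugging Face's trl library. The dataset, hyperparameters, and paths are illustrative stand-ins, not rinna's actual configuration; depending on your trl version, the tokenizer argument may be named tokenizer rather than processing_class.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "qwq-bakeneko-32b-merged"  # output of the merge sketch above
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# ORPO expects preference pairs: a prompt plus chosen/rejected responses.
# A toy in-memory dataset stands in for the ~1.3k DeepSeek-R1-generated
# samples mentioned above.
train_dataset = Dataset.from_list([
    {
        "prompt": "日本で一番高い山は?",
        "chosen": "日本で一番高い山は富士山で、標高は3,776メートルです。",
        "rejected": "日本一高い山は北岳です。",
    },
])

config = ORPOConfig(
    output_dir="qwq-bakeneko-32b-orpo",
    beta=0.1,                      # weight of the odds-ratio penalty term
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```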

Core Capabilities

  • Strong Japanese language performance: 78.31 average score on the Japanese LM Evaluation Harness
  • Capable multi-turn conversation: 8.52 on Japanese MT-Bench
  • Improved instruction following from the Chat Vector merge
  • Reasoning ability distilled from DeepSeek-R1-generated training data

Frequently Asked Questions

Q: What makes this model unique?

The model combines a Chat Vector merge with ORPO training, an approach that outperforms its predecessors on both single-turn and multi-turn Japanese conversation benchmarks. Its training pipeline targets Japanese language understanding while preserving the reasoning ability inherited from QwQ-32B.

Q: What are the recommended use cases?

The model excels at Japanese language tasks, particularly those involving complex reasoning and multi-turn conversation. It is well suited to applications that need sophisticated Japanese language understanding, reliable instruction following, and detailed step-by-step reasoning.
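A minimal inference sketch with transformers is shown below; the prompt and generation settings are illustrative, and QwQ-style reasoning models generally work best with sampling enabled.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/qwq-bakeneko-32b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Example multi-step reasoning prompt in Japanese
# ("There are 5 apples and 3 oranges. How many fruits in total?").
messages = [
    {"role": "user",
     "content": "りんごが5個、みかんが3個あります。果物は全部でいくつですか?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs, max_new_tokens=512, do_sample=True, temperature=0.6
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```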
