QwQ Bakeneko 32B
| Property | Value |
|---|---|
| Parameter Count | 32 Billion |
| Architecture | 64-layer, 5120-hidden-size transformer |
| License | Apache License 2.0 |
| Release Date | March 13, 2025 |
| Author | rinna |
What is qwq-bakeneko-32b?
QwQ Bakeneko 32B is an instruction-tuned Japanese reasoning model built on the rinna/qwen2.5-bakeneko-32b foundation. It combines Chat Vector merging with Odds Ratio Preference Optimization (ORPO) to deliver stronger performance on Japanese language tasks.
Implementation Details
The model is produced through a multi-stage process that combines model merging and distillation. First, a Chat Vector, calculated by subtracting the parameter vectors of Qwen/Qwen2.5-32B from Qwen/QwQ-32B, is added to the base rinna/qwen2.5-bakeneko-32b model. The merged model is then refined through ORPO training on 1.3k carefully curated data samples generated by DeepSeek-R1 (a sketch of the merging arithmetic follows the list below).
- Advanced 64-layer transformer architecture with 5120 hidden size
- Innovative Chat Vector technology for improved instruction following
- ORPO optimization for enhanced reasoning capabilities
- Comprehensive benchmarking showing superior performance in Japanese tasks
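The Chat Vector step described above amounts to simple parameter arithmetic. The following is a minimal sketch of that arithmetic using transformers and PyTorch; it assumes all three checkpoints share the same Qwen2.5-32B architecture, and it is not rinna's actual merging script, which may exclude certain layers (such as embeddings) from the addition.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the three checkpoints named above (each is roughly 64 GB in bf16).
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-32B", torch_dtype=torch.bfloat16)
reasoner = AutoModelForCausalLM.from_pretrained("Qwen/QwQ-32B", torch_dtype=torch.bfloat16)
target = AutoModelForCausalLM.from_pretrained("rinna/qwen2.5-bakeneko-32b", torch_dtype=torch.bfloat16)

base_sd, reasoner_sd = base.state_dict(), reasoner.state_dict()

# Chat Vector = QwQ-32B minus its pre-trained base, Qwen2.5-32B.
# Adding it to qwen2.5-bakeneko-32b transfers QwQ's reasoning and
# instruction-following behavior onto the Japanese-adapted weights.
with torch.no_grad():
    for name, param in target.named_parameters():
        param.add_(reasoner_sd[name] - base_sd[name])

target.save_pretrained("qwq-bakeneko-32b-merged")
```

The ORPO refinement described above is then applied to this merged checkpoint.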
Core Capabilities
- Strong results on Japanese language tasks (average score of 78.31 on the Japanese LM Evaluation Harness)
- Enhanced multi-turn conversation ability (8.52 on Japanese MT-Bench)
- Improved instruction following through Chat Vector technology
- Advanced reasoning capabilities through DeepSeek-R1 distillation
Frequently Asked Questions
Q: What makes this model unique?
The model combines Chat Vector merging with ORPO optimization, an approach that outperforms its predecessors in both single-turn and multi-turn Japanese conversations. Its training recipe specifically targets Japanese language understanding while preserving strong reasoning capabilities.
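For context on what ORPO adds, the sketch below implements the odds-ratio term of the ORPO objective in PyTorch. This is a generic illustration of the technique, not rinna's training code; the `beta` weighting and the use of length-normalized log-probabilities are assumptions drawn from common ORPO implementations.

```python
import torch
import torch.nn.functional as F

def orpo_odds_ratio_loss(chosen_logps: torch.Tensor,
                         rejected_logps: torch.Tensor,
                         beta: float = 0.1) -> torch.Tensor:
    """Odds-ratio penalty from ORPO (illustrative sketch, not rinna's code).

    chosen_logps / rejected_logps: length-normalized log-probabilities of the
    preferred and dispreferred responses under the policy model.
    beta: weight of the penalty relative to the NLL term (assumed value).
    """
    # odds(y|x) = p / (1 - p), so log-odds = log p - log(1 - p).
    log_odds_chosen = chosen_logps - torch.log1p(-torch.exp(chosen_logps))
    log_odds_rejected = rejected_logps - torch.log1p(-torch.exp(rejected_logps))
    # Penalize cases where the rejected response has higher odds than the chosen one.
    return -beta * F.logsigmoid(log_odds_chosen - log_odds_rejected).mean()

# Full ORPO objective = standard NLL loss on the chosen responses + this penalty.
```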
Q: What are the recommended use cases?
The model excels at Japanese language tasks, particularly scenarios requiring complex reasoning and multi-turn conversation. It is well suited to applications that demand sophisticated Japanese language understanding, instruction following, and detailed step-by-step reasoning.
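As a starting point, the model can be run through the standard Hugging Face transformers chat interface. The snippet below is an illustrative sketch rather than an official usage example: the repository id follows rinna's naming convention, and the sampling settings (temperature, token budget) are assumptions, not documented recommendations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/qwq-bakeneko-32b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Japanese prompt: "Prove that there are infinitely many primes."
messages = [{"role": "user", "content": "素数が無限に存在することを証明してください。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models tend to produce long outputs, so allow a generous budget.
outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```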