qwq-bakeneko-32b

rinna

A 32B-parameter Japanese language model built on rinna/qwen2.5-bakeneko-32b and fine-tuned with Chat Vector merging and ORPO for stronger reasoning and instruction following.

Parameter Count: 32 Billion
Architecture: 64-layer, 5120-hidden-size transformer
License: Apache License 2.0
Release Date: March 13, 2025
Author: rinna

What is qwq-bakeneko-32b?

QwQ Bakeneko 32B is an instruction-tuned Japanese language model with a focus on reasoning. Built on the rinna/qwen2.5-bakeneko-32b foundation, it combines Chat Vector merging with Odds Ratio Preference Optimization (ORPO) to improve performance on Japanese language tasks.

Implementation Details

The model employs a multi-stage training process that combines model merging and distillation. First, the base model is enhanced by adding a Chat Vector, obtained by subtracting the parameter vectors of Qwen/Qwen2.5-32B from Qwen/QwQ-32B. The merged model is then refined through ORPO training on 1.3k carefully curated preference samples generated by DeepSeek-R1.
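The merge itself is simple parameter arithmetic. Below is a minimal sketch using Hugging Face transformers; the model IDs are the repositories named above, but the recipe details (such as skipping embedding and output layers) are assumptions for illustration, not rinna's published script.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the three checkpoints involved in the Chat Vector merge.
base = AutoModelForCausalLM.from_pretrained("rinna/qwen2.5-bakeneko-32b", torch_dtype=torch.bfloat16)
instruct = AutoModelForCausalLM.from_pretrained("Qwen/QwQ-32B", torch_dtype=torch.bfloat16)
pretrained = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-32B", torch_dtype=torch.bfloat16)

inst_sd = instruct.state_dict()
pre_sd = pretrained.state_dict()

with torch.no_grad():
    for name, param in base.named_parameters():
        # Chat-vector merges commonly exclude embedding and output layers;
        # this exclusion is an assumption here, not a documented detail.
        if "embed_tokens" in name or "lm_head" in name:
            continue
        # chat vector = QwQ-32B - Qwen2.5-32B, added onto the bakeneko base
        param.add_(inst_sd[name] - pre_sd[name])

base.save_pretrained("qwq-bakeneko-32b-merged")
```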

  • Advanced 64-layer transformer architecture with 5120 hidden size
  • Innovative Chat Vector technology for improved instruction following
  • ORPO optimization for enhanced reasoning capabilities (see the training sketch after this list)
  • Comprehensive benchmarking showing superior performance in Japanese tasks
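For the preference-optimization stage, a hedged sketch using TRL's ORPOTrainer is shown below. The dataset path, hyperparameters, and file format are illustrative assumptions; rinna's actual 1.3k DeepSeek-R1-generated preference data is not reproduced here.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "qwq-bakeneko-32b-merged"  # hypothetical: the chat-vector-merged checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# ORPOTrainer expects preference pairs: "prompt", "chosen", "rejected" columns.
dataset = load_dataset("json", data_files="orpo_pairs.jsonl", split="train")  # assumed file

config = ORPOConfig(
    output_dir="qwq-bakeneko-32b-orpo",
    beta=0.1,                       # odds-ratio loss weight; assumed value
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,     # `tokenizer=` on older TRL releases
)
trainer.train()
```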

Core Capabilities

  • Strong performance on Japanese language tasks (average score of 78.31 on the Japanese LM Evaluation Harness)
  • Enhanced multi-turn conversation ability (8.52 on Japanese MT-Bench)
  • Improved instruction following via the Chat Vector merge
  • Advanced reasoning capabilities distilled from DeepSeek-R1

Frequently Asked Questions

Q: What makes this model unique?

The model combines Chat Vector merging with ORPO preference optimization, an approach that outperforms its predecessors in both single-turn and multi-turn Japanese conversation. Its training pipeline specifically targets Japanese language understanding while preserving the strong reasoning of its QwQ lineage.

Q: What are the recommended use cases?

The model excels in Japanese language tasks, particularly in scenarios requiring complex reasoning and multi-turn conversations. It's well-suited for applications requiring sophisticated Japanese language understanding, instruction following, and detailed reasoning capabilities.
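A minimal inference sketch follows, assuming the model ships a standard chat template on Hugging Face; the prompt and generation settings are illustrative, not rinna's recommended values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/qwq-bakeneko-32b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# "What is a 'bakeneko'? Please explain briefly." (Japanese prompt)
messages = [{"role": "user", "content": "「化け猫」とは何ですか？簡潔に説明してください。"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```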
