japanese-gpt-neox-3.6b-instruction-sft-v2

rinna

Japanese GPT-NeoX model (3.6B parameters) fine-tuned for instruction following. It uses a SentencePiece tokenizer and a fixed ユーザー/システム conversation format. MIT licensed.

Property         Value
Parameter Count  3.6 Billion
Model Type       Instruction-tuned Language Model
Architecture     36-layer, 2816-hidden-size transformer
License          MIT
Authors          Tianyu Zhao and Kei Sawada

What is japanese-gpt-neox-3.6b-instruction-sft-v2?

This is a Japanese language model based on the GPT-NeoX architecture, fine-tuned for instruction-following and conversational tasks. It is a newer version of the earlier instruction-tuned model, trained on a different data split, and it performed better than that model in ChatGPT-based automated evaluations.

Implementation Details

The model uses a SentencePiece tokenizer with a 32,000-token vocabulary. Conversations are formatted as alternating turns between a user (ユーザー) and the system (システム).

  • Tokenizer with byte fallback, so characters outside the vocabulary are decomposed into bytes instead of becoming unknown tokens
  • Conversation format built on the ユーザー (user) and システム (system) roles
  • Fine-tuned on datasets translated into Japanese, including Anthropic HH RLHF, FLAN, and Stanford Human Preferences
  • Supports standard sampling controls at generation time, such as temperature and repetition penalty
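
The conversation format can be sketched as a small helper. This is a minimal sketch, not the authors' code. The `<NL>` turn separator and the trailing `システム: ` cue follow the upstream rinna model card rather than anything stated on this page, so treat both as assumptions; `build_prompt` is a hypothetical helper name.

```python
# Assumed conventions (taken from the upstream model card, not from this page):
# turns are joined with the literal string "<NL>", and the prompt ends with
# "システム: " so that the model answers in the system role.
USER = "ユーザー"
SYSTEM = "システム"
TURN_SEPARATOR = "<NL>"


def build_prompt(turns):
    """Format a list of (speaker, text) pairs into one prompt string."""
    lines = []
    for speaker, text in turns:
        if speaker not in (USER, SYSTEM):
            raise ValueError(f"unknown speaker: {speaker!r}")
        # Newlines inside an utterance are written with the same symbol.
        lines.append(f"{speaker}: {text}".replace("\n", TURN_SEPARATOR))
    # Finish with an empty system turn for the model to complete.
    lines.append(f"{SYSTEM}: ")
    return TURN_SEPARATOR.join(lines)


prompt = build_prompt([
    (USER, "日本の首都はどこですか？"),
    (SYSTEM, "東京です。"),
    (USER, "人口はどのくらいですか？"),
])
print(prompt)
```

The helper rejects any speaker other than the two roles, which catches a mislabeled turn before it reaches the model.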

Core Capabilities

  • Natural Japanese language understanding and generation
  • Instruction-following in conversational contexts
  • Handles multi-turn dialogue
  • Preserves whitespace and special characters accurately
  • 55% win rate against the previous version in ChatGPT-based automated evaluations
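
Byte fallback is what lets the tokenizer keep characters that are missing from its vocabulary. The toy encoder below illustrates the idea only: it is a character-level stand-in with a hypothetical tiny vocabulary, not SentencePiece's actual segmentation algorithm. The `<0xNN>` spelling mirrors how SentencePiece names its byte pieces.

```python
def encode_with_byte_fallback(text, vocab):
    """Toy character-level encoder: unknown characters become UTF-8 byte tokens."""
    tokens = []
    for ch in text:
        if ch in vocab:
            tokens.append(ch)
        else:
            # Fall back to one token per UTF-8 byte instead of an unknown token.
            tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return tokens


def decode_with_byte_fallback(tokens):
    """Reassemble the original text from character and byte tokens."""
    buf = bytearray()
    for tok in tokens:
        if len(tok) == 6 and tok.startswith("<0x") and tok.endswith(">"):
            buf.append(int(tok[3:5], 16))
        else:
            buf.extend(tok.encode("utf-8"))
    return buf.decode("utf-8")


# Hypothetical vocabulary: a few kana plus the space character.
toy_vocab = set("こんにちは ")
text = "こんにちは  🙂"  # two spaces, then a character outside the vocabulary
tokens = encode_with_byte_fallback(text, toy_vocab)
print(tokens)
print(decode_with_byte_fallback(tokens) == text)
```

Because every character either has its own token or is spelled out byte by byte, decoding recovers the input exactly, including the repeated spaces.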

Frequently Asked Questions

Q: What makes this model unique?

The model pairs a tokenizer built for Japanese text (SentencePiece with byte fallback) with instruction fine-tuning on translated dialogue and instruction datasets, and it improves on the previous version in automated evaluations.

Q: What are the recommended use cases?

The model is particularly well-suited for Japanese conversational AI applications, chatbots, and instruction-following tasks where natural Japanese language interaction is required. It's designed to handle both formal and informal conversation styles.
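
For a chatbot-style use case, generation with Hugging Face Transformers might look like the sketch below. Several details are assumptions rather than facts from this page: the Hub ID `rinna/japanese-gpt-neox-3.6b-instruction-sft-v2`, the `<NL>` newline symbol and `</s>` end marker in the output, the need for `use_fast=False`, and the sampling values, which are illustrative rather than tuned. `generate_reply` and `clean_reply` are hypothetical helper names, and the prompt passed in is expected to be already formatted in the ユーザー/システム style.

```python
MODEL_NAME = "rinna/japanese-gpt-neox-3.6b-instruction-sft-v2"  # assumed Hub ID

# Illustrative sampling settings (assumptions, not tuned recommendations).
GENERATION_KWARGS = {
    "do_sample": True,
    "max_new_tokens": 128,
    "temperature": 0.7,
    "repetition_penalty": 1.1,
}


def clean_reply(text, eos_token="</s>"):
    """Drop the end-of-sequence marker and restore newlines in model output."""
    return text.replace(eos_token, "").replace("<NL>", "\n").strip()


def generate_reply(prompt):
    """Generate one system turn for an already formatted prompt string."""
    # Heavy imports stay local so that loading this file is cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The slow (SentencePiece) tokenizer is assumed to be required here.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    model.eval()

    token_ids = tokenizer.encode(
        prompt, add_special_tokens=False, return_tensors="pt"
    )
    with torch.no_grad():
        output_ids = model.generate(
            token_ids.to(model.device),
            pad_token_id=tokenizer.pad_token_id,
            bos_token_id=tokenizer.bos_token_id,
            eos_token_id=tokenizer.eos_token_id,
            **GENERATION_KWARGS,
        )
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids.tolist()[0][token_ids.size(1):]
    return clean_reply(tokenizer.decode(new_tokens), tokenizer.eos_token or "</s>")
```

Calling `generate_reply` downloads the full model weights on first use, so a real application would load the tokenizer and model once and reuse them across turns.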
