RakutenAI-7B-instruct

Maintained By: Rakuten

Property        Value
Parameters      7B
Architecture    Mistral-based
License         Apache License 2.0
Languages       Japanese, English
Paper           arXiv:2403.15484

What is RakutenAI-7B-instruct?

RakutenAI-7B-instruct is a bilingual language model for Japanese and English language understanding. Built on the Mistral architecture, it extends the tokenizer vocabulary from 32k to 48k tokens so that Japanese text is split into fewer tokens. The model reports the highest average score on Japanese LM-Harness benchmarks among similar models while remaining competitive on English tasks.
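Why a larger vocabulary helps with Japanese can be illustrated with a toy sketch. It assumes a character-level tokenizer that falls back to one token per UTF-8 byte for any character missing from its vocabulary. This is a simplification for illustration only, not the actual RakutenAI-7B tokenizer, and the names `count_tokens`, `small_vocab`, and `extended_vocab` are hypothetical.

```python
def count_tokens(text: str, vocab: set) -> int:
    """Toy tokenizer: one token per in-vocabulary character,
    otherwise one token per UTF-8 byte (byte fallback)."""
    total = 0
    for ch in text:
        if ch in vocab:
            total += 1
        else:
            # Kanji and kana occupy 3 bytes each in UTF-8.
            total += len(ch.encode("utf-8"))
    return total


text = "日本語のテキスト"  # 8 Japanese characters

# Hypothetical vocabulary without any Japanese characters.
small_vocab = set("abcdefghijklmnopqrstuvwxyz ")
# Hypothetical extended vocabulary that also covers the sample text.
extended_vocab = small_vocab | set(text)

small_count = count_tokens(text, small_vocab)
extended_count = count_tokens(text, extended_vocab)
print(f"small vocabulary:    {small_count} tokens")
print(f"extended vocabulary: {extended_count} tokens")
```

In this toy setting the extended vocabulary cuts the token count to a third. Real subword tokenizers behave differently in detail, but fewer tokens per sentence generally means more text fits in the context window and generation takes fewer steps.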

Implementation Details

The model is built on the Mistral-7B-v0.1 pre-trained checkpoint and fine-tuned on a mix of datasets: JSNLI, RTE, KUCI, BELEBELE, JCS, JNLI, Dolly-15K, and OpenAssistant1. The instruction tuning is designed to preserve strong performance on both Japanese and English tasks.

  • Extended vocabulary optimization for Japanese text
  • Comprehensive instruction-tuning on diverse datasets
  • Highest average score on Japanese LM-Harness metrics
  • Superior performance on English benchmarks compared to similar models

Core Capabilities

  • 93.03% accuracy on JCS benchmark
  • 90.39% accuracy on JNLI tasks
  • 96.00% accuracy on MARC-ja
  • Strong performance on English tasks including ARC (58.62%) and HellaSwag (82.70%)
  • Balanced bilingual capabilities with state-of-the-art Japanese performance

Frequently Asked Questions

Q: What makes this model unique?

RakutenAI-7B-instruct stands out for its performance on Japanese language tasks while maintaining strong English capabilities. It achieves this through an expanded vocabulary and a carefully curated instruction-tuning process, which makes it particularly effective for bilingual applications.

Q: What are the recommended use cases?

The model is well suited to bilingual applications that need Japanese and English language understanding, including text generation and question answering. It is strongest on Japanese language processing while remaining competitive in English.
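A minimal usage sketch for text generation with the Hugging Face `transformers` library follows. Several things here are assumptions not stated on this page: the repository id `Rakuten/RakutenAI-7B-instruct`, the USER/ASSISTANT prompt template, and the generation settings. Check the official model card before relying on any of them. The helpers `build_prompt` and `generate` are hypothetical names.

```python
# Assumed Hugging Face repository id; verify against the official model card.
MODEL_ID = "Rakuten/RakutenAI-7B-instruct"

# Assumed prompt template (Vicuna-style); the official card may differ.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(user_input: str) -> str:
    """Wrap a user message in the assumed USER/ASSISTANT template."""
    return f"{SYSTEM_PROMPT} USER: {user_input} ASSISTANT:"


def generate(user_input: str, max_new_tokens: int = 256) -> str:
    """Download the model and generate a reply (needs a GPU with enough memory)."""
    # Imported lazily so build_prompt can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" requires the `accelerate` package.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(build_prompt(user_input), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads several GB of weights):
# print(generate("日本で一番高い山は何ですか？"))
```

Greedy decoding (`do_sample=False`) is used here only to keep the output reproducible; sampling parameters can be passed to `model.generate` instead.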
