japanese-gpt2-medium

rinna

Japanese GPT-2 medium-sized language model (361M params) trained on CC-100 and Wikipedia, optimized for Japanese text generation and language modeling.

Property          Value
Parameter Count   361M
License           MIT
Research Paper    View Paper
Training Data     Japanese CC-100 and Wikipedia
Architecture      24-layer, 1024-hidden-size transformer

What is japanese-gpt2-medium?

japanese-gpt2-medium is a GPT-2 language model developed by rinna Co., Ltd. for Japanese text generation. It is the medium-sized variant, with 361 million parameters, and was trained on the Japanese portion of CC-100 and on Japanese Wikipedia.

Implementation Details

The model uses a transformer architecture with 24 layers and a hidden size of 1024. It was trained on 8 V100 GPUs for approximately 30 days and reaches a perplexity of around 18 on the validation set. Tokenization is handled by a sentencepiece-based tokenizer whose vocabulary was trained on Japanese Wikipedia.
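To make the perplexity figure concrete, here is a minimal sketch of the usual definition: the exponential of the mean negative log-likelihood per token. The function name and the toy input are illustrative assumptions, not part of the model's own evaluation code.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood), using natural logs."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Toy input (an assumption for illustration): a model that assigns every
# token a probability of 1/18 has a perplexity of 18.
toy_log_probs = [math.log(1 / 18)] * 5
print(round(perplexity(toy_log_probs), 6))
```

Read this way, a perplexity of around 18 means the model is, on average, about as uncertain as if it were choosing uniformly among 18 tokens at each step.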

  • Transformer-based architecture with 24 layers
  • 1024-dimensional hidden states
  • Trained on Japanese CC-100 and Wikipedia datasets
  • Optimized for Japanese language understanding and generation
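The architecture figures above can be related to a parameter count with a rough back-of-the-envelope sketch. It assumes the standard GPT-2 block layout with tied input and output embeddings, and it assumes a vocabulary of 32,000 tokens and a context length of 1,024, neither of which is stated on this page.

```python
def gpt2_param_count(n_layer, d_model, vocab_size, n_ctx):
    """Count weights in a standard GPT-2 stack with tied output embeddings."""
    # Token and position embedding tables.
    embeddings = vocab_size * d_model + n_ctx * d_model
    # Attention: fused QKV projection plus output projection, with biases.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward: expand to 4 * d_model, then project back, with biases.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two layer norms per block, each with a scale and a bias vector.
    layer_norms = 2 * (2 * d_model)
    final_layer_norm = 2 * d_model
    return embeddings + n_layer * (attention + mlp + layer_norms) + final_layer_norm


# vocab_size and n_ctx are assumptions, not values given on this page.
weights = gpt2_param_count(n_layer=24, d_model=1024, vocab_size=32_000, n_ctx=1_024)
# Assumption: one n_ctx x n_ctx causal-mask buffer stored per layer.
mask_buffers = 24 * 1_024 * 1_024

print(weights)
print(weights + mask_buffers)
```

Under these assumptions the weights alone come to roughly 336M, and adding per-layer causal-mask buffers brings the total to roughly 361M. That is one possible reading of the 361M figure, not something the source confirms.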

Core Capabilities

  • Japanese text generation
  • Language modeling, with a validation perplexity of around 18
  • Efficient tokenization using sentencepiece
  • Compatible with Hugging Face's transformers library
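As a hedged illustration of the transformers compatibility, the sketch below loads the model and samples a continuation. The Hub identifier `rinna/japanese-gpt2-medium`, the helper names, and the sampling settings are assumptions chosen for illustration. It also assumes `torch`, `transformers`, and `sentencepiece` are installed.

```python
MODEL_ID = "rinna/japanese-gpt2-medium"  # assumed Hub id, not stated on this page


def sampling_kwargs(max_new_tokens=40):
    """Illustrative sampling settings (hypothetical defaults; tune as needed)."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "top_k": 50,
        "top_p": 0.95,
    }


def generate(prompt, max_new_tokens=40):
    """Download the model on first use and return the prompt plus a sampled continuation."""
    # Imported lazily so the heavy dependencies load only when text is generated.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # use_fast=False selects the sentencepiece-backed (slow) tokenizer.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    # Skip special tokens so no end-of-sequence marker is appended to the prompt.
    input_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            pad_token_id=tokenizer.pad_token_id,
            **sampling_kwargs(max_new_tokens),
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("こんにちは、")` downloads the weights on first use and returns a different continuation each time, because sampling is enabled.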

Frequently Asked Questions

Q: What makes this model unique?

The model is trained only on Japanese text, combining CC-100 and Wikipedia data, and pairs that training with a sentencepiece tokenizer built from Japanese Wikipedia. Both the model and its vocabulary are therefore tailored to Japanese text generation.

Q: What are the recommended use cases?

The model is well-suited for Japanese text generation tasks, language modeling, and general Japanese NLP applications. It's particularly useful for researchers and developers working on Japanese language AI applications.
