# DeBERTa V2 Base Japanese
| Property | Value |
|---|---|
| Parameter Count | 137M |
| License | CC-BY-SA-4.0 |
| Training Data | Japanese Wikipedia, CC-100, OSCAR |
| MLM Accuracy | 0.779 |
## What is deberta-v2-base-japanese?
deberta-v2-base-japanese is a Japanese language model based on the DeBERTa V2 architecture, pre-trained by ku-nlp on a large corpus of Japanese text. With 137 million parameters, it delivers strong results on masked language modeling and a range of downstream Japanese NLP tasks.
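As a quick illustration of the masked language modeling interface, the sketch below loads the checkpoint through Hugging Face transformers. It assumes the model id `ku-nlp/deberta-v2-base-japanese` and that the input text has already been segmented into whitespace-separated words (e.g. with Juman++); the exact preprocessing expected by the upstream pipeline may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "ku-nlp/deberta-v2-base-japanese"  # assumed Hugging Face model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Input text pre-segmented into words (spaces between words), with one [MASK].
text = "京都 大学 で 自然 言語 処理 を [MASK] する 。"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the highest-scoring vocabulary entry.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```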
## Implementation Details
The model was trained on a combined 171GB dataset comprising Japanese Wikipedia, CC-100, and OSCAR corpora. Training ran on 8 NVIDIA A100-SXM4-40GB GPUs over three weeks. Tokenization is two-stage: text is first segmented into words with Juman++ and then split into subwords with a sentencepiece model using a 32,000-token vocabulary (see the sketch after the list below).
- Training batch size: 2,112
- Learning rate: 2e-4
- Training steps: 500,000
- Sequence length: 512
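To make the two-stage tokenization concrete, here is a minimal sketch of pre-segmenting raw text with Juman++ before handing it to the sentencepiece-based tokenizer. It assumes the Juman++ binary and the pyknp Python binding are installed; the exact pipeline used during pre-training may differ.

```python
from pyknp import Juman
from transformers import AutoTokenizer

jumanpp = Juman()  # wraps the locally installed Juman++ binary

def segment(text: str) -> str:
    """Split raw Japanese text into whitespace-separated words with Juman++."""
    result = jumanpp.analysis(text)
    return " ".join(m.midasi for m in result.mrph_list())

tokenizer = AutoTokenizer.from_pretrained("ku-nlp/deberta-v2-base-japanese")

raw = "京都大学で自然言語処理を研究する。"
words = segment(raw)                   # word-segmented string, e.g. "京都 大学 で ..."
subwords = tokenizer.tokenize(words)   # sentencepiece subword tokens
print(subwords)
```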
## Core Capabilities
- Masked Language Modeling with 0.779 accuracy
- Strong performance on JGLUE benchmark tasks
- Specialized for Japanese text processing
- Support for fine-tuning on downstream tasks
## Frequently Asked Questions
### Q: What makes this model unique?
It pairs the DeBERTa V2 architecture (disentangled attention with an enhanced mask decoder) with large-scale Japanese pre-training and a two-stage tokenization pipeline, Juman++ word segmentation followed by sentencepiece subwords, yielding competitive performance on Japanese NLP benchmarks.
### Q: What are the recommended use cases?
The model excels in masked language modeling tasks and can be fine-tuned for various downstream applications including sentiment analysis (MARC-ja), textual similarity (JSTS), natural language inference (JNLI), and question answering (JSQuAD, JComQA).
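For illustration, below is a hedged fine-tuning sketch for a binary sentence-classification task in the style of MARC-ja, using the standard transformers Trainer. The tiny in-memory dataset, column names, and hyperparameters are placeholders, not the settings used in any published JGLUE evaluation; inputs are assumed to be pre-segmented with Juman++.

```python
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    DataCollatorWithPadding,
    TrainingArguments,
    Trainer,
)

model_id = "ku-nlp/deberta-v2-base-japanese"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Tiny illustrative dataset of pre-segmented reviews (hypothetical examples);
# a real MARC-ja run would load the JGLUE data instead.
train_data = Dataset.from_dict({
    "text": ["この 商品 は とても 良い 。", "この 商品 は 最悪 だっ た 。"],
    "label": [1, 0],
})

def preprocess(batch):
    # Tokenize the pre-segmented text; padding is handled by the collator.
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_data = train_data.map(preprocess, batched=True)

args = TrainingArguments(
    output_dir="deberta-v2-base-japanese-marc-ja",  # illustrative output path
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    num_train_epochs=1,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_data,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```

The same pattern extends to the other listed tasks by swapping in the appropriate dataset and model head (e.g. a question-answering head for JSQuAD).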