DeBERTa V2 Base Japanese

Maintained by: ku-nlp

Property         Value
Parameter Count  137M
License          CC-BY-SA-4.0
Training Data    Wikipedia, CC-100, OSCAR
MLM Accuracy     0.779

What is deberta-v2-base-japanese?

deberta-v2-base-japanese is a Japanese language model based on the DeBERTa V2 architecture and trained on a large corpus of Japanese text. Developed by ku-nlp, the model has 137 million parameters and delivers strong performance across a range of Japanese NLP tasks.

Implementation Details

The model was trained on a combined 171GB dataset comprising the Japanese Wikipedia, CC-100, and OSCAR corpora, using 8 NVIDIA A100-SXM4-40GB GPUs over three weeks. Tokenization is two-stage: Juman++ segments raw text into words, and SentencePiece then splits those words into subwords drawn from a 32,000-token vocabulary (a loading sketch follows the list below).

  • Training batch size: 2,112
  • Learning rate: 2e-4
  • Training steps: 500,000
  • Sequence length: 512
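
As a concrete illustration of the two-stage tokenization described above, the following minimal sketch loads the tokenizer through the standard Hugging Face transformers API and encodes a sentence that has already been segmented by Juman++. The model ID is taken from this page; the example sentence is purely illustrative.

```python
from transformers import AutoTokenizer

# Load the tokenizer for this model (ID taken from this page).
tokenizer = AutoTokenizer.from_pretrained("ku-nlp/deberta-v2-base-japanese")

# The model expects input already segmented into words by Juman++
# (space-separated); SentencePiece then maps those words onto the
# 32,000-token subword vocabulary.
segmented = "京都 大学 で 自然 言語 処理 を 専攻 する 。"
encoding = tokenizer(segmented, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(encoding.input_ids[0]))
```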

Core Capabilities

  • Masked Language Modeling with 0.779 accuracy (see the sketch after this list)
  • Strong performance on JGLUE benchmark tasks
  • Specialized for Japanese text processing
  • Support for fine-tuning on downstream tasks
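
Below is a minimal masked-prediction sketch, assuming the standard transformers masked-LM API and PyTorch; the input sentence is illustrative and, as noted above, must be pre-segmented with Juman++.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "ku-nlp/deberta-v2-base-japanese"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# One [MASK] in a Juman++-pre-segmented sentence (illustrative example).
text = "京都 大学 で 自然 言語 処理 を [MASK] する 。"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Decode the top-scoring token at the masked position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
```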

Frequently Asked Questions

Q: What makes this model unique?

This model combines DeBERTa V2's advanced architecture with comprehensive Japanese language training, utilizing a unique word segmentation approach through Juman++ and achieving competitive performance on various Japanese NLP benchmarks.
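
The Juman++ segmentation step can be scripted with the pyknp package; the snippet below is a hedged sketch that assumes pyknp and a local Juman++ installation, neither of which is covered by this card.

```python
from pyknp import Juman  # assumes pyknp and a local Juman++ install

# Helper (illustrative, not from the model card): split raw Japanese text
# into words with Juman++, producing the space-separated form the model's
# tokenizer expects.
def segment(text: str) -> str:
    jumanpp = Juman()
    return " ".join(m.midasi for m in jumanpp.analysis(text).mrph_list())

print(segment("京都大学で自然言語処理を専攻する。"))
# -> 京都 大学 で 自然 言語 処理 を 専攻 する 。
```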

Q: What are the recommended use cases?

The model excels at masked language modeling and can be fine-tuned for various downstream applications including sentiment analysis (MARC-ja), semantic textual similarity (JSTS), natural language inference (JNLI), and question answering (JSQuAD, JCommonsenseQA).
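
As a sketch of that fine-tuning path, the following outlines sequence classification on a sentiment task such as MARC-ja. The dataset ID (shunk031/JGLUE), the hyperparameters, and the field names are assumptions for illustration, not part of the model card.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_id = "ku-nlp/deberta-v2-base-japanese"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Assumed community mirror of JGLUE; swap in your own data as needed.
dataset = load_dataset("shunk031/JGLUE", name="MARC-ja")

def preprocess(batch):
    # Texts should be pre-segmented with Juman++ first (omitted for brevity).
    return tokenizer(batch["sentence"], truncation=True, max_length=512)

encoded = dataset.map(preprocess, batched=True)

args = TrainingArguments(
    output_dir="deberta-v2-base-japanese-marc-ja",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```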
