deberta-v1-base

Maintained By
deepvk

DeBERTa-v1-base

Property          Value
Parameter Count   124M
License           Apache 2.0
Languages         Russian, English
Training Data     400GB text corpus
Author            deepvk

What is deberta-v1-base?

DeBERTa-v1-base is a pretrained bidirectional encoder for Russian language processing, developed by deepvk. It was trained on a 400GB text corpus drawn from sources such as Wikipedia, books, social media, and news content.

Implementation Details

The model has 12 encoder layers, 12 attention heads, and an embedding dimension of 768. It uses the GELU activation function and byte-level BPE tokenization with a vocabulary of 50,266 tokens. Training used mixed FP16 precision on 8xA100 GPUs and took approximately 30 days.
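
As a sanity check on the dimensions above, the parameter count can be estimated from the weight matrices alone. The sketch below uses the generic Transformer encoder formula and is an approximation, not deepvk's own accounting: it ignores biases, layer normalization, and any DeBERTa-specific relative-position parameters. The function name `estimate_params` is hypothetical.

```python
# Back-of-the-envelope parameter estimate from the published dimensions.
# Assumption: only the large weight matrices are counted (no biases,
# no LayerNorm, no relative-position parameters).

VOCAB_SIZE = 50_266   # byte-level BPE vocabulary
HIDDEN = 768          # embedding dimension
FFN = 3_072           # feed-forward inner dimension
LAYERS = 12           # encoder layers


def estimate_params(vocab: int, hidden: int, ffn: int, layers: int) -> int:
    embedding = vocab * hidden            # token embedding matrix
    attention = 4 * hidden * hidden       # Q, K, V and output projections
    feed_forward = 2 * hidden * ffn       # up- and down-projection
    return embedding + layers * (attention + feed_forward)


total = estimate_params(VOCAB_SIZE, HIDDEN, FFN, LAYERS)
print(f"{total / 1e6:.1f}M")  # prints 123.5M
```

The estimate lands close to the 124M listed in the table, with the token embeddings accounting for roughly 38.6M of the total.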

  • 12 encoder layers with 12 attention heads
  • 768 dimensional embeddings with 3,072 FFN dimension
  • Trained with AdamW optimizer and linear learning rate scheduler
  • Training corpus deduplicated with the MinHash algorithm
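
The card names MinHash but does not describe the deduplication pipeline. The following is only a minimal illustration of the general technique: each document is reduced to a short signature, and the fraction of matching signature positions estimates the Jaccard similarity of the documents' shingle sets. The shingle size, number of hash functions, and all function names are assumptions made for this sketch, not details of the actual training setup.

```python
import hashlib


def shingles(text: str, n: int = 3) -> set:
    """Return the set of word n-grams in the text (n is an assumed value)."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def minhash_signature(text: str, num_hashes: int = 64, n: int = 3) -> list:
    """For each salted hash function, keep the minimum hash over all shingles."""
    items = [s.encode("utf-8") for s in shingles(text, n)]
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item, digest_size=8, salt=salt).digest(), "little"
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a: list, sig_b: list) -> float:
    """Fraction of positions where the two signatures agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

In a deduplication pass, documents whose estimated similarity exceeds a chosen threshold would be treated as near-duplicates and all but one dropped. At corpus scale this is usually combined with locality-sensitive hashing so that not every pair has to be compared.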

Core Capabilities

  • Strong performance on Russian SuperGLUE benchmark tasks
  • Effective feature extraction for Russian language understanding
  • Maximum sequence length of 512 tokens
  • Robust handling of both Russian and English text
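
Because inputs are limited to 512 tokens, longer documents have to be truncated or split before encoding. One common approach is an overlapping sliding window over the token IDs, sketched below. The function name, the overlap of 64 tokens, and the assumption that two positions per window are reserved for special tokens are all illustrative choices, not values taken from the model card.

```python
def chunk_token_ids(token_ids, max_len=512, num_special=2, overlap=64):
    """Split token IDs into overlapping windows that fit the model's limit.

    num_special is the number of positions reserved per window for special
    tokens (assumed to be 2 here); overlap is how many tokens consecutive
    windows share.
    """
    window = max_len - num_special
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("overlap must be smaller than the usable window")
    if not token_ids:
        return []

    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # the last window already reaches the end of the input
    return chunks


# Example: 1,000 token IDs become three windows of at most 510 tokens each.
chunks = chunk_token_ids(list(range(1000)))
print([len(c) for c in chunks])  # prints [510, 510, 108]
```

Each chunk can then be encoded separately and the resulting vectors combined, for example by averaging.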

Frequently Asked Questions

Q: What makes this model unique?

The model was pretrained on a curated and deduplicated 400GB Russian-language corpus. It is reported to achieve state-of-the-art results on several Russian SuperGLUE tasks, most notably PARus and MuSeRC.

Q: What are the recommended use cases?

This model is well suited to feature extraction for Russian language processing, including text classification, semantic analysis, and general language understanding. It is an encoder-only model with no pretrained head, so it can be adapted to various downstream tasks by adding a task-specific head.
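
A minimal feature-extraction sketch follows, assuming the checkpoint is published on the Hugging Face Hub under the ID `deepvk/deberta-v1-base` and loads through the `transformers` Auto classes. Mean pooling over non-padding tokens is one common way to turn per-token outputs into a sentence vector; it is an illustrative choice here, not a method prescribed by the model's authors. The helper names `mean_pool` and `embed` are hypothetical.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask excludes every token")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


def embed(texts, model_id="deepvk/deberta-v1-base"):
    """Return one mean-pooled vector per input text.

    The model ID is an assumption. The heavy imports live inside the
    function so that mean_pool can be used without them installed.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    # Truncate to the model's 512-token limit and pad to a common length.
    batch = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden)

    masks = batch["attention_mask"].tolist()
    return [mean_pool(h, m) for h, m in zip(hidden.tolist(), masks)]


# Usage (downloads the checkpoint on first call):
# vectors = embed(["Привет, мир!", "Hello, world!"])
# len(vectors[0]) should equal the embedding dimension, 768
```

The resulting vectors can be fed to a lightweight classifier or used for similarity search. For best results on a specific task, fine-tuning the encoder together with a task head is the more usual route.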
