FRED-T5-1.7B

Maintained By
ai-forever

Model Size: 1.7B parameters
Architecture: T5-based (24 layers, 1536 hidden size)
Training Data: 300GB Russian language corpus
License: Apache 2.0
Research Paper: View Paper

What is FRED-T5-1.7B?

FRED-T5-1.7B (Full-scale Russian Enhanced Denoisers T5) is a 1.7B-parameter language model developed by SberDevices specifically for Russian language processing. The model represents a significant advance in Russian NLP: it was trained on a mixture of 7 denoisers, similar to the UL2 architecture.

Implementation Details

The model uses a BPE tokenizer with 50,257 base tokens plus 107 special tokens, including task-prefix tokens such as '<LM>' and '<SC1>' through '<SC6>'. Training ran for approximately 45 days on 112 A100 GPUs, following a two-phase schedule in which the initial phase used only a small subset of the dataset before continuing on the full corpus.

  • 24-layer architecture with 1536 hidden size
  • Trained on 300GB Russian language corpus
  • Implements multiple denoising objectives
  • Specialized prefix tokens for different tasks
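
The snippet below is a minimal loading sketch. It assumes the model is pulled from the Hugging Face hub under the ai-forever/FRED-T5-1.7B id and follows the usual FRED-T5 pairing of a GPT-2-style BPE tokenizer with a T5 conditional-generation class; exact class choices may differ from your setup.

```python
# Minimal loading sketch. Assumes the Hugging Face hub id
# "ai-forever/FRED-T5-1.7B" and the GPT-2-style BPE tokenizer /
# T5 conditional-generation pairing described above.
import torch
from transformers import GPT2Tokenizer, T5ForConditionalGeneration

tokenizer = GPT2Tokenizer.from_pretrained("ai-forever/FRED-T5-1.7B", eos_token="</s>")
model = T5ForConditionalGeneration.from_pretrained("ai-forever/FRED-T5-1.7B")
model.eval()

# Assuming the task prefixes are registered among the 107 special tokens,
# they resolve to single token ids:
print(tokenizer.convert_tokens_to_ids(["<LM>", "<SC1>", "<SC6>"]))
```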

Core Capabilities

  • Advanced text generation in Russian
  • Multiple denoising tasks support
  • Flexible prefix-based task conditioning (see the sketch after this list)
  • Russian SuperGLUE benchmark optimization
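
As a hedged illustration of prefix conditioning (building on the loading sketch above), the '<LM>' prefix asks the model for plain left-to-right continuation. The prompt text here is an arbitrary example, not taken from the model card:

```python
# Prefix-conditioned generation: "<LM>" requests left-to-right continuation.
# The Russian prompt is an illustrative assumption.
text = "<LM>Москва - столица"
input_ids = torch.tensor([tokenizer.encode(text)])
outputs = model.generate(
    input_ids,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=32,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```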

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its specialized training approach combining 7 denoisers with a two-phase training strategy, specifically optimized for Russian language tasks. The implementation of multiple prefix tokens allows for versatile task handling.
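
To make the denoiser prefixes concrete, here is a hedged span-infilling sketch (same loading assumptions as above): '<SC1>' selects one of the denoising modes, and a sentinel such as '<extra_id_0>' marks the span the model should reconstruct. The input text and sentinel placement are illustrative assumptions.

```python
# Span infilling: "<SC1>" selects a denoiser; "<extra_id_0>" marks the gap.
# Text and sentinel usage are illustrative, not from the model card.
text = "<SC1>Вчера мы ходили <extra_id_0>, и всем очень понравилось."
input_ids = torch.tensor([tokenizer.encode(text)])
outputs = model.generate(input_ids, eos_token_id=tokenizer.eos_token_id, max_new_tokens=16)
# The decoded output contains the model's prediction for the masked span.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```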

Q: What are the recommended use cases?

FRED-T5-1.7B is particularly well-suited to Russian language processing tasks, including text generation, completion, and various conditional generation tasks. Its multiple prefix tokens make it adaptable to a range of NLP applications.
