mt5-small-sum-de-mit-v1
| Property | Value |
|---|---|
| Base Model | google/mt5-small |
| Task | German Text Summarization |
| License | MIT |
| Developer | Deutsche Telekom AG |
| Training Dataset | SwissText 2019 (84,564 examples) |
What is mt5-small-sum-de-mit-v1?
mt5-small-sum-de-mit-v1 is a German text summarization model developed by Deutsche Telekom's One Conversation team. Built on Google's multilingual mT5-small architecture, it stands out for its permissive MIT license, which makes it suitable for commercial applications. The model is specifically optimized for generating concise German summaries of at most 96 tokens.
Implementation Details
The model was trained with careful hyperparameter tuning: a batch size of 3 (an effective batch size of 6 with gradient accumulation), a maximum source length of 800 tokens, and a learning rate of 5e-5 over 10 epochs. Training used a warmup ratio of 0.3 and the prefix "summarize: " for input formatting.
- Trained on SwissText 2019 dataset with 84,564 examples
- Uses the google/mt5-small tokenizer with a 96-token summary limit
- Achieves ROUGE scores: ROUGE-1: 16.80, ROUGE-2: 3.55, ROUGE-L: 12.69
- Implements gradient accumulation steps of 2 for stable training
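Taken together, these settings fix the optimizer schedule. A quick sanity check of the implied step counts, assuming the warmup ratio applies to the total number of optimizer steps:

```python
import math

# Training hyperparameters as stated above
examples = 84_564
batch_size = 3
grad_accum = 2
epochs = 10
warmup_ratio = 0.3

effective_batch = batch_size * grad_accum            # 6
steps_per_epoch = math.ceil(examples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = int(warmup_ratio * total_steps)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
# 6 14094 140940 42282
```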
Core Capabilities
- German text summarization with controlled output length
- Commercial-friendly licensing under MIT
- Efficient processing with mt5-small architecture
- Balanced performance for resource-conscious applications
Frequently Asked Questions
Q: What makes this model unique?
The model's key differentiator is its MIT license, which permits unrestricted commercial use, something many language-model licenses do not. It is also specifically optimized for German summarization tasks with controlled output length.
Q: What are the recommended use cases?
The model is ideal for automated German text summarization in commercial applications, content aggregation systems, and news digest generation where output length control is important. Its smaller size makes it suitable for deployment in resource-constrained environments.
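For such deployments, inference can be sketched with the standard Hugging Face transformers API. The "summarize: " prefix and the 800/96 token limits come from the training details above; the Hub repo id (deutsche-telekom/mt5-small-sum-de-mit-v1) and the beam-search setting are assumptions for illustration:

```python
def build_input(text: str) -> str:
    """Prepend the 'summarize: ' prefix the model was trained with."""
    return "summarize: " + text

def summarize(text: str,
              max_source_length: int = 800,
              max_summary_length: int = 96) -> str:
    # transformers is imported lazily so the prefix helper stays usable without it
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_id = "deutsche-telekom/mt5-small-sum-de-mit-v1"  # assumed Hub repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(build_input(text), max_length=max_source_length,
                       truncation=True, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_summary_length,
                                num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Sources longer than 800 tokens are truncated before generation, so the model only ever sees the opening of very long documents.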