mt5-small-sum-de-mit-v1
| Property | Value |
|---|---|
| Base Model | google/mt5-small |
| Task | German Text Summarization |
| License | MIT |
| Developer | Deutsche Telekom AG |
| Training Dataset | SwissText 2019 (84,564 examples) |
What is mt5-small-sum-de-mit-v1?
mt5-small-sum-de-mit-v1 is a German text summarization model developed by Deutsche Telekom's One Conversation team. Built on Google's multilingual mT5-small architecture, it stands out for its permissive MIT license, which makes it suitable for commercial applications. The model is specifically optimized for generating concise German summaries of at most 96 tokens.
Implementation Details
The model was trained with careful hyperparameter tuning: a batch size of 3 (an effective batch size of 6 with gradient accumulation), a maximum source length of 800 tokens, and a learning rate of 5e-5 over 10 epochs. Training used a warmup ratio of 0.3 and the prefix "summarize: " for input formatting.
- Trained on SwissText 2019 dataset with 84,564 examples
- Uses the google/mt5-small tokenizer with a 96-token summary limit
- Achieves ROUGE scores: ROUGE-1: 16.80, ROUGE-2: 3.55, ROUGE-L: 12.69
- Implements gradient accumulation steps of 2 for stable training
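Taken together, these settings fix the optimizer schedule. A quick sanity check of the implied step counts, assuming the warmup ratio applies to the total number of optimizer steps:

```python
import math

# Training hyperparameters as stated above
examples = 84_564
batch_size = 3
grad_accum = 2
epochs = 10
warmup_ratio = 0.3

effective_batch = batch_size * grad_accum            # 6
steps_per_epoch = math.ceil(examples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = int(warmup_ratio * total_steps)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
# 6 14094 140940 42282
```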
Core Capabilities
- German text summarization with controlled output length
- Commercial-friendly licensing under MIT
- Efficient processing with mt5-small architecture
- Balanced performance for resource-conscious applications
Frequently Asked Questions
Q: What makes this model unique?
The model's key differentiator is its MIT license, which permits unrestricted commercial use, something many language-model licenses do not. It is also specifically optimized for German summarization tasks with controlled output length.
Q: What are the recommended use cases?
The model is ideal for automated German text summarization in commercial applications, content aggregation systems, and news digest generation where output length control is important. Its smaller size makes it suitable for deployment in resource-constrained environments.
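For such deployments, inference can be sketched with the standard Hugging Face transformers API. The "summarize: " prefix and the 800/96 token limits come from the training details above; the Hub repo id (deutsche-telekom/mt5-small-sum-de-mit-v1) and the beam-search setting are assumptions for illustration:

```python
def build_input(text: str) -> str:
    """Prepend the 'summarize: ' prefix the model was trained with."""
    return "summarize: " + text

def summarize(text: str,
              max_source_length: int = 800,
              max_summary_length: int = 96) -> str:
    # transformers is imported lazily so the prefix helper stays usable without it
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_id = "deutsche-telekom/mt5-small-sum-de-mit-v1"  # assumed Hub repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(build_input(text), max_length=max_source_length,
                       truncation=True, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_summary_length,
                                num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Sources longer than 800 tokens are truncated before generation, so the model only ever sees the opening of very long documents.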