TowerInstruct-7B-v0.2
Property | Value |
---|---|
Parameter Count | 7 Billion |
License | CC-BY-NC-4.0 / LLAMA 2 Community License |
Authors | Unbabel, Instituto Superior Técnico, CentraleSupélec University |
Languages Supported | English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, Russian |
What is TowerInstruct-7B-v0.2?
TowerInstruct-7B-v0.2 is an advanced multilingual language model specifically designed for translation-related tasks. Built upon TowerBase, this model represents a significant improvement over its predecessor, particularly in document-level translation capabilities. It's the result of collaborative work between Unbabel and leading academic institutions, fine-tuned on the specialized TowerBlocks dataset.
Implementation Details
The model utilizes a 7B parameter architecture and implements the ChatML prompt template format. It's trained with specific hyperparameters including a batch size of 256, learning rate of 7e-06, and cosine scheduler with 500 warmup steps. The model supports sequences up to 2048 tokens and was trained for 4 epochs using the Adam optimizer.
- Specialized fine-tuning on TowerBlocks dataset
- Supports both sentence and paragraph-level translation
- Implements ChatML prompt format without system prompts
- Optimized for translation-specific tasks and multilingual capabilities
Core Capabilities
- General machine translation across 10 languages
- Automatic post-edition
- Named-entity recognition
- Grammatical error correction
- Context-aware translation
- Terminology-aware translation
- Paraphrase generation
Frequently Asked Questions
Q: What makes this model unique?
TowerInstruct-7B-v0.2 stands out for its specialized focus on translation-related tasks and improved document-level translation capabilities compared to v0.1. It's trained on a diverse range of translation-specific datasets and supports multiple language pairs with high efficiency.
Q: What are the recommended use cases?
The model is best suited for translation tasks, post-editing, named-entity recognition, and grammar correction across its supported languages. While it includes some conversational and code instruction capabilities, it's not primarily intended for use as a chatbot or code assistant.