# MistralLite
| Property | Value |
|---|---|
| Developer | Amazon |
| Base Model | Mistral-7B-v0.1 |
| Context Length | 32K tokens |
| License | Apache 2.0 |
## What is MistralLite?

MistralLite is a fine-tuned version of Mistral-7B-v0.1, optimized for processing long context sequences of up to 32K tokens. Developed by Amazon, it improves handling of extended text through an adapted rotary position embedding (a larger `rope_theta`) and an expanded sliding window applied during fine-tuning.
## Implementation Details

MistralLite changes two configuration values relative to the base model: `rope_theta` is raised to 1,000,000, and the sliding window is expanded to 16,384 tokens, compared to the original Mistral-7B's 4,096. These modifications enable much stronger performance on long-context tasks while leaving the core architecture unchanged.
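The effect of raising `rope_theta` can be sketched numerically: rotary embeddings derive their per-dimension rotation frequencies from this base, and a larger base slows every rotation, keeping distant positions distinguishable. This is a minimal sketch (the helper `rope_inv_freq` and the head dimension of 128 are our assumptions for illustration, not code from the model):

```python
def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    # Inverse frequencies used by rotary position embeddings (RoPE):
    # one frequency per pair of dimensions, theta^(-2i/d).
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]

# Mistral-7B-v0.1 ships with rope_theta = 10000; MistralLite raises it to 1000000.
base = rope_inv_freq(128, 10_000.0)
lite = rope_inv_freq(128, 1_000_000.0)

# Every non-trivial frequency shrinks, so positional phases rotate more
# slowly and remain informative out to 32K token positions.
```

The first frequency is always 1.0 in both configurations; it is the higher dimensions, which encode coarse long-range position, that benefit most from the larger base.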
- Fine-tuned on SLED, Natural Questions, and OpenAssistant datasets
- Supports multiple serving frameworks including TGI, vLLM, and HuggingFace transformers
- Deployable on AWS g5.2x instances with SageMaker
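Whichever serving framework is used, requests need to follow the model's prompt template. The model card documents OpenAssistant-style role tags; a minimal helper (the function name `format_prompt` is ours, not part of any library) might look like:

```python
def format_prompt(instruction: str) -> str:
    # MistralLite expects the instruction wrapped in OpenAssistant-style
    # role tags, with </s> closing the user turn.
    return f"<|prompter|>{instruction}</s><|assistant|>"

prompt = format_prompt("Summarize the attached report.")
```

The same string would be passed as the prompt to TGI, vLLM, or a plain `transformers` generation call.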
## Core Capabilities
- Achieves 98-100% accuracy on topic retrieval tasks up to 13,780 tokens
- Improves line retrieval accuracy to 60-98%, a substantial gain over the base model
- Enhanced question-answering capabilities with 64.4% accuracy on test sets
- Maintains strong performance on standard benchmarks (57.2% average on standard metrics)
## Frequently Asked Questions

**Q: What makes this model unique?**
MistralLite's distinctive feature is its optimized performance on long-context tasks while maintaining the simple architecture of Mistral-7B. It significantly outperforms the base model in extended context scenarios while being deployable on single GPU instances.
**Q: What are the recommended use cases?**
The model excels in long-context applications including document analysis, multi-document question answering, extended summarization tasks, and semantic search across large text segments. It's particularly suitable for enterprise applications requiring efficient processing of lengthy documents.