MobileLLM-350M-layer-share
Property | Value |
---|---|
Parameter Count | 345.3M |
License | CC-BY-NC-4.0 |
Architecture | 32 layers, 15 attention heads |
Paper | MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases |
Training Duration | ~6 days on 32 A100 GPUs |
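
As a rough sanity check on the training figure in the table, the arithmetic below converts "~6 days on 32 A100 GPUs" into GPU-hours and, using the 1T training tokens stated on this page, into an approximate per-GPU throughput. Both inputs are approximate, so the results are order-of-magnitude estimates, not measured values.

```python
# Rough arithmetic from the figures quoted on this page (all approximate).
DAYS = 6         # "~6 days" from the table
GPUS = 32        # A100 GPUs from the table
TOKENS = 1e12    # 1T training tokens, as stated in the implementation details

# Total accelerator time spent on the run.
gpu_hours = DAYS * 24 * GPUS

# Average number of training tokens processed per GPU per second.
tokens_per_gpu_second = TOKENS / (gpu_hours * 3600)

print(f"GPU-hours: {gpu_hours}")
print(f"Tokens per GPU-second: {tokens_per_gpu_second:,.0f}")
```

This works out to 4,608 GPU-hours, or roughly 60,000 tokens per GPU per second on average.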
What is MobileLLM-350M-layer-share?
MobileLLM-350M-layer-share is a language model developed by Meta for on-device applications. With about 345M parameters, it is designed to deliver strong accuracy for its size while remaining small enough for resource-constrained environments.
Implementation Details
The model has 32 layers, 15 attention heads, and 5 KV heads, with a token dimension of 960. It was trained on 1T tokens of publicly available online data and supports a context length of 2k tokens. Its main design choices are:
- Uses the SwiGLU activation function
- Uses grouped-query attention, in which the 15 query heads share 5 KV heads
- Uses a deep and thin architecture (many layers relative to the token dimension)
- Uses embedding sharing to reduce the parameter count
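The head counts above determine the attention geometry. The sketch below works through it under stated assumptions: the class `AttentionShape` and its method names are hypothetical, the projections are assumed to be bias-free, and consecutive query heads are assumed to share a KV head. It is an illustration of grouped-query attention arithmetic, not the model's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionShape:
    """Hypothetical helper holding the attention sizes quoted in the text."""
    hidden_size: int = 960   # token dimension
    num_heads: int = 15      # query heads
    num_kv_heads: int = 5    # key/value heads

    @property
    def head_dim(self) -> int:
        # Width of a single attention head.
        assert self.hidden_size % self.num_heads == 0
        return self.hidden_size // self.num_heads

    @property
    def queries_per_kv_head(self) -> int:
        # How many query heads read from each KV head.
        assert self.num_heads % self.num_kv_heads == 0
        return self.num_heads // self.num_kv_heads

    def kv_head_for(self, query_head: int) -> int:
        # Assumption: consecutive query heads are grouped onto one KV head.
        return query_head // self.queries_per_kv_head

    def attention_params(self, grouped: bool = True) -> int:
        # Weights in the Q, K, V and output projections of one layer,
        # assuming no bias terms.
        kv_heads = self.num_kv_heads if grouped else self.num_heads
        kv_width = kv_heads * self.head_dim
        q_and_o = 2 * self.hidden_size * self.hidden_size
        k_and_v = 2 * self.hidden_size * kv_width
        return q_and_o + k_and_v

    def kv_values_per_token(self, grouped: bool = True) -> int:
        # Cached key and value entries per token, per layer.
        kv_heads = self.num_kv_heads if grouped else self.num_heads
        return 2 * kv_heads * self.head_dim


shape = AttentionShape()
print("head_dim:", shape.head_dim)
print("query heads per KV head:", shape.queries_per_kv_head)
print("KV head for each query head:",
      [shape.kv_head_for(q) for q in range(shape.num_heads)])
print("attention params per layer (grouped):", shape.attention_params(True))
print("attention params per layer (full multi-head):", shape.attention_params(False))
print("cached values per token per layer (grouped):", shape.kv_values_per_token(True))
print("cached values per token per layer (full multi-head):", shape.kv_values_per_token(False))
```

Under these assumptions each head is 64 wide and three query heads share each KV head, so the key/value cache per token per layer is one third of what 15 full KV heads would need.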
Core Capabilities
- Achieves 52.1% accuracy on zero-shot commonsense reasoning tasks
- Supports text generation and understanding tasks
- Optimized for on-device deployment
- Reported to outperform earlier models of similar size on these benchmarks
Frequently Asked Questions
Q: What makes this model unique?
MobileLLM-350M's optimized architecture is reported to give a 4.3% accuracy gain over previous state-of-the-art models of similar size on zero-shot commonsense reasoning tasks, while staying small enough for on-device applications.
Q: What are the recommended use cases?
The model is designed for on-device applications where memory and compute are limited. It is suited to text generation and language-understanding tasks in those settings.
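For text generation, the prompt and the generated tokens together have to fit in the 2k-token context. The sketch below shows one way to budget for that and then generate text with the Hugging Face `transformers` library. Several things here are assumptions not stated on this page: that "2k" means 2048 tokens, that the checkpoint is published under the repo ID `facebook/MobileLLM-350M-layer-share`, and that it loads with `use_fast=False` and `trust_remote_code=True`. The helper names are hypothetical.

```python
CONTEXT_LENGTH = 2048  # assumption: "2k tokens" means 2048
MODEL_ID = "facebook/MobileLLM-350M-layer-share"  # assumed Hugging Face repo ID


def prompt_budget(max_new_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Number of prompt tokens that leaves room for the requested generation."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    return context_length - max_new_tokens


def clip_prompt(token_ids, max_new_tokens: int, context_length: int = CONTEXT_LENGTH):
    """Keep only the most recent prompt tokens that fit in the budget."""
    budget = prompt_budget(max_new_tokens, context_length)
    return list(token_ids[-budget:])


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and continue `prompt` (downloads weights on first use)."""
    # Imported lazily so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the checkpoint needs the slow tokenizer and custom model code.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    token_ids = clip_prompt(tokenizer(prompt)["input_ids"], max_new_tokens)
    input_ids = torch.tensor([token_ids])
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Pure arithmetic; no model download is triggered here.
    print(prompt_budget(64))
```

Calling `generate("Once upon a time")` would download the weights and run the model. Clipping from the left keeps the most recent tokens, which is a design choice that suits chat-style or streaming input; other applications may prefer to summarize or reject over-long prompts.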