MobileLLM-350M-layer-share

Property	Value
Parameter Count	345.3M
License	CC-BY-NC-4.0
Architecture	32 layers, 15 attention heads
Paper	MobileLLM Paper
Training Duration	~6 days on 32 A100 GPUs

What is MobileLLM-350M-layer-share?

MobileLLM-350M-layer-share is a language model developed by Meta for on-device applications. It targets strong accuracy at a compact size (345.3M parameters), which makes it suitable for resource-constrained environments.

Implementation Details

The model has 32 layers, 15 attention heads, and 5 key-value (KV) heads, with a token dimension of 960. It was trained on 1T tokens of publicly available online data and supports a context length of 2k tokens.
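As a quick sanity check on these figures, the sketch below derives the per-head dimension and the number of query heads that share each KV head. It assumes the token dimension is split evenly across attention heads, which is the usual convention but is not stated in this card; the variable names are illustrative only.

```python
# Figures quoted in this section.
HIDDEN_DIM = 960      # token dimension
NUM_HEADS = 15        # attention (query) heads
NUM_KV_HEADS = 5      # key/value heads

# Assumption: the token dimension is divided evenly among the heads.
head_dim = HIDDEN_DIM // NUM_HEADS                 # 64
queries_per_kv_head = NUM_HEADS // NUM_KV_HEADS    # 3

# Values cached per token in a single attention layer (keys + values).
kv_values_gqa = 2 * NUM_KV_HEADS * head_dim   # 640 with 5 KV heads
kv_values_mha = 2 * NUM_HEADS * head_dim      # 1920 if every head had its own K/V

print(head_dim, queries_per_kv_head, kv_values_gqa, kv_values_mha)
```

Under this assumption, using 5 KV heads instead of 15 cuts the per-layer key/value cache to one third, which is the kind of saving that matters on memory-limited devices.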

Uses the SwiGLU activation function
Uses grouped-query attention to reduce key/value computation and memory
Follows a deep-and-thin architecture design
Shares weights between the input and output embeddings
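Two of the techniques listed above can be illustrated in a few lines of plain Python. This is a minimal sketch, not the model's actual implementation: the function names are hypothetical, SwiGLU is shown only as its element-wise gating step, and the head mapping assumes consecutive query heads share a KV head (a common convention that this card does not confirm).

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def swiglu(gate: list[float], up: list[float]) -> list[float]:
    """Element-wise SwiGLU gating: silu(gate) * up.

    In a full feed-forward block, `gate` and `up` would be two separate
    linear projections of the same input, followed by a down projection.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


def kv_head_for_query_head(q_head: int, num_heads: int = 15,
                           num_kv_heads: int = 5) -> int:
    """Grouped-query attention: map a query head to the KV head it reads.

    Assumption: consecutive query heads form a group.
    """
    group_size = num_heads // num_kv_heads
    return q_head // group_size


mapping = [kv_head_for_query_head(h) for h in range(15)]
print(mapping)  # [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
print(swiglu([0.0, 1.0], [2.0, 3.0]))
```

With 15 query heads and 5 KV heads, each KV head serves a group of three query heads.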

Core Capabilities

Achieves 52.1% accuracy on zero-shot commonsense reasoning tasks
Supports text generation and understanding tasks
Optimized for on-device deployment
Outperforms previous state-of-the-art models of similar size on zero-shot commonsense reasoning tasks
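For readers unfamiliar with how a zero-shot multiple-choice accuracy figure is typically produced, the sketch below scores each answer option and counts how often the highest-scoring option is the correct one. It illustrates the general idea only; the exact evaluation protocol behind the 52.1% figure is not described in this card, and the scores in the example are made up.

```python
def zero_shot_accuracy(examples):
    """Compute accuracy over (choice_scores, correct_index) pairs.

    choice_scores holds one model score per answer option (for example a
    log-likelihood); the prediction is the highest-scoring option.
    """
    correct = 0
    for scores, answer in examples:
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += predicted == answer
    return correct / len(examples)


# Made-up scores purely for illustration; these are not real model outputs.
toy_examples = [
    ([-4.2, -1.3, -3.0], 1),
    ([-2.0, -2.5], 1),
    ([-0.7, -5.1, -6.0, -3.3], 0),
    ([-1.1, -0.9], 1),
]
print(zero_shot_accuracy(toy_examples))  # 0.75
```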

Frequently Asked Questions

Q: What makes this model unique?

MobileLLM-350M's optimized architecture achieves a 4.3% accuracy boost over previous state-of-the-art models of similar size on zero-shot commonsense reasoning tasks, while remaining efficient enough for on-device applications.

Q: What are the recommended use cases?

The model is designed for on-device applications where compute and memory are limited. It is suited to text generation and language understanding tasks in those settings.
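A minimal text-generation sketch using the Hugging Face Transformers library follows. Several details are assumptions rather than facts from this card: the Hub repository id, the need for `trust_remote_code=True` (only required if the repository ships custom model code), and the slow tokenizer setting. The helper name `generate` is hypothetical.

```python
# Assumed Hub repository id; verify it before use.
MODEL_ID = "facebook/MobileLLM-350M-layer-share"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Load the model and return a greedy continuation of `prompt`."""
    # Imported lazily so this file can be imported without the dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
    # Assumption: the repository may rely on custom modeling code.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("On-device language models are useful because")` would download the weights on first use. Keep prompt plus generated tokens within the 2k-token context length, and note that the CC-BY-NC-4.0 license restricts commercial use.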