MobileLLM-350M-layer-share

Maintained by: facebook

Property             Value
Parameter Count      345.3M
License              CC-BY-NC-4.0
Architecture         32 layers, 15 attention heads
Paper                MobileLLM Paper
Training Duration    ~6 days on 32 A100 GPUs

What is MobileLLM-350M-layer-share?

MobileLLM-350M-layer-share is a language model developed by Meta for on-device applications. With 345.3M parameters, it is designed to deliver strong accuracy for its size while staying compact enough for resource-constrained environments.

Implementation Details

The model has 32 layers, 15 attention heads, and 5 KV heads, with a token dimension of 960. It was trained on 1T tokens of publicly available online data and supports a context length of 2k tokens.
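To make these figures concrete, the sketch below derives the per-head dimension, the number of query heads per KV head, and a rough key/value cache size at full context. It is an illustration only. It assumes "2k" means 2048 tokens, a 2-byte (fp16/bf16) cache, and that each of the 32 layers contributes one set of keys and values; a layer-sharing variant that applies a block more than once would scale the cache accordingly.

```python
# Figures quoted in the text above.
NUM_LAYERS = 32
NUM_HEADS = 15       # query heads
NUM_KV_HEADS = 5     # key/value heads
HIDDEN_DIM = 960     # token dimension

# Assumptions, not stated in the source:
CONTEXT_LEN = 2048   # "2k tokens" read as 2048
BYTES_PER_VALUE = 2  # fp16 / bf16 cache entries

head_dim = HIDDEN_DIM // NUM_HEADS          # width of one attention head
queries_per_kv = NUM_HEADS // NUM_KV_HEADS  # query heads sharing one KV head


def kv_cache_bytes(num_kv_heads: int, seq_len: int = CONTEXT_LEN) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    per_token_per_layer = 2 * num_kv_heads * head_dim  # keys + values
    return NUM_LAYERS * seq_len * per_token_per_layer * BYTES_PER_VALUE


gqa_bytes = kv_cache_bytes(NUM_KV_HEADS)  # cache with 5 KV heads
mha_bytes = kv_cache_bytes(NUM_HEADS)     # hypothetical cache with 15 KV heads

print(f"head_dim={head_dim}, queries per KV head={queries_per_kv}")
print(f"KV cache with 5 KV heads:  {gqa_bytes / 2**20:.0f} MiB")
print(f"KV cache with 15 KV heads: {mha_bytes / 2**20:.0f} MiB")
```

Under these assumptions, using 5 KV heads instead of 15 cuts the cache to one third, which is the kind of saving that matters on memory-limited devices.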

  • Uses the SwiGLU activation function in its feed-forward layers
  • Uses grouped-query attention (15 query heads sharing 5 KV heads) to reduce key/value memory
  • Follows a deep-and-thin design: 32 layers at a token dimension of 960
  • Shares weights between the input and output embeddings
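Two of the techniques listed above can be sketched in a few lines of plain Python. This is a toy illustration, not the model's implementation: the helper names (`kv_head_for`, `swiglu_ffn`, `matvec`) are hypothetical, the grouping of consecutive query heads onto one KV head is an assumed convention, and the weight matrices are tiny placeholders.

```python
import math

NUM_HEADS = 15     # query heads, from the text above
NUM_KV_HEADS = 5   # key/value heads, from the text above


def kv_head_for(query_head: int) -> int:
    """Grouped-query attention: map a query head to the KV head it reads.

    Assumes consecutive query heads form a group (0-2 -> 0, 3-5 -> 1, ...).
    """
    group_size = NUM_HEADS // NUM_KV_HEADS
    return query_head // group_size


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Which KV head each of the 15 query heads uses.
head_map = [kv_head_for(q) for q in range(NUM_HEADS)]

# Tiny placeholder weights: 2-dim input, 2-dim hidden, 1-dim output.
identity = [[1.0, 0.0], [0.0, 1.0]]
out = swiglu_ffn([1.0, 2.0], identity, identity, [[1.0, 1.0]])
```

The gating in `swiglu_ffn` is what distinguishes SwiGLU from a plain two-matrix feed-forward layer: one projection is passed through SiLU and multiplied elementwise with a second projection before the output projection.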

Core Capabilities

  • Achieves 52.1% accuracy on zero-shot commonsense reasoning tasks
  • Supports text generation and understanding tasks
  • Optimized for on-device deployment
  • Outperforms previous state-of-the-art models of similar size by 4.3% on zero-shot commonsense reasoning tasks

Frequently Asked Questions

Q: What makes this model unique?

MobileLLM-350M combines a deep-and-thin architecture, embedding sharing, and grouped-query attention to achieve a 4.3% accuracy boost over previous state-of-the-art models of similar size on zero-shot commonsense reasoning tasks, while remaining efficient enough for on-device applications.

Q: What are the recommended use cases?

The model is designed for on-device applications where memory and compute are limited. It is suited to text generation, text understanding, and other natural language processing tasks that must run within a small computational budget.
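For the text generation use case, a minimal sketch with the Hugging Face `transformers` library might look like the following. Several details are assumptions rather than facts from this page: the repository id `facebook/MobileLLM-350M-layer-share`, the need for `trust_remote_code=True` and `use_fast=False`, and reading the 2k context as 2048 tokens. Access to the weights may also require accepting the license terms on the Hub. The helper names `clamp_new_tokens` and `generate_text` are hypothetical.

```python
MODEL_ID = "facebook/MobileLLM-350M-layer-share"  # assumed Hub repository id
MAX_CONTEXT = 2048  # assumption: "2k tokens" read as 2048


def clamp_new_tokens(prompt_len: int, requested: int) -> int:
    """Limit generation so prompt plus output fits in the context window."""
    return max(0, min(requested, MAX_CONTEXT - prompt_len))


def generate_text(prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy generation; downloads the model on first use."""
    # Imported lazily so the helpers above work without these libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumed loading options; adjust if the repository documents others.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    budget = clamp_new_tokens(inputs["input_ids"].shape[1], max_new_tokens)
    if budget == 0:
        return prompt  # no room left in the context window

    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=budget)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("On-device language models are useful because")` would then return the prompt followed by up to 32 generated tokens, provided `torch` and `transformers` are installed and the weights are accessible.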
