mpt-1b-redpajama-200b-dolly

mosaicml

A 1.3B-parameter decoder-only transformer pre-trained on the RedPajama dataset and fine-tuned on the Databricks Dolly dataset, with support for FlashAttention and ALiBi.

Property         Value
Parameter Count  1.3 billion
License          CC-BY-SA-3.0
Release Date     April 20, 2023
Training Data    RedPajama + Dolly Dataset

What is mpt-1b-redpajama-200b-dolly?

MPT-1B-RedPajama-200B-Dolly is a 1.3 billion parameter decoder-only transformer language model developed by MosaicML. It was pre-trained on the RedPajama dataset for 200B tokens and then fine-tuned on the Databricks Dolly instruction dataset.

Implementation Details

The model uses a modified decoder-only transformer architecture with 24 layers, 16 attention heads, and a width (hidden size) of 2048. Its implementation has the following notable features:

  • Uses ALiBi (Attention with Linear Biases) in place of traditional positional embeddings
  • Applies QK LayerNorm (layer normalization of the attention queries and keys) for training stability
  • Omits bias terms from its layers, which slightly reduces the parameter count
  • Supports an optimized Triton implementation of FlashAttention
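
To make the ALiBi item above concrete, here is a minimal pure-Python sketch of the attention bias it adds in place of positional embeddings. It follows the slope schedule from the ALiBi paper for a power-of-two head count; that the model's own code uses exactly this schedule is an assumption, and the function names `alibi_slopes` and `alibi_bias` are illustrative, not taken from the MosaicML codebase.

```python
def alibi_slopes(n_heads: int) -> list[float]:
    """Per-head slopes for a power-of-two head count (ALiBi paper schedule)."""
    if n_heads <= 0 or n_heads & (n_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    # Geometric sequence: 2^(-8/n), 2^(-16/n), ..., 2^(-8).
    return [2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)]


def alibi_bias(n_heads: int, seq_len: int) -> list[list[list[float]]]:
    """Causal bias added to attention scores before the softmax.

    bias[h][i][j] = slope_h * (j - i) for j <= i, so more distant keys get a
    larger penalty; future positions (j > i) are masked with -inf.
    """
    neg_inf = float("-inf")
    return [
        [
            [m * (j - i) if j <= i else neg_inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in alibi_slopes(n_heads)
    ]


# 16 heads matches the head count stated for this model; 4 is a toy length.
bias = alibi_bias(n_heads=16, seq_len=4)
print(alibi_slopes(16)[:2])  # slopes of the first two heads
print(bias[0][3])            # penalties seen by the last query in head 0
```

Because the bias depends only on the distance between query and key, no learned position table is needed.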

Core Capabilities

  • Text generation with instruction-following abilities
  • Efficient processing with FlashAttention support
  • Handles sequence lengths up to 2048 tokens
  • Compatible with PyTorch and the Hugging Face Transformers library
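
The capabilities above suggest a loading pattern along the following lines. This is a hedged sketch, not a verified recipe: the repository id, the tokenizer id, and the `attn_impl="triton"` keyword are assumptions based on the upstream Hugging Face model card, and the custom model code may not work with every version of the Transformers library. The helper names `load_kwargs` and `generate` are hypothetical.

```python
MODEL_ID = "mosaicml/mpt-1b-redpajama-200b-dolly"  # assumed Hugging Face repo id
TOKENIZER_ID = "EleutherAI/gpt-neox-20b"           # assumed tokenizer for this model
MAX_SEQ_LEN = 2048                                 # sequence limit stated above


def load_kwargs(use_triton_flash_attn: bool = False) -> dict:
    """Keyword arguments for from_pretrained (attn_impl is an assumed option)."""
    kwargs = {"trust_remote_code": True}  # the model ships custom modeling code
    if use_triton_flash_attn:
        kwargs["attn_impl"] = "triton"
    return kwargs


def generate(prompt: str, max_new_tokens: int = 64,
             use_triton_flash_attn: bool = False) -> str:
    """Download the model and generate a continuation for one prompt."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, **load_kwargs(use_triton_flash_attn)
    )
    if use_triton_flash_attn:
        # The Triton kernel path needs a GPU and a half-precision dtype.
        model.to(device="cuda:0", dtype=torch.bfloat16)
    model.eval()

    # Truncate the prompt so prompt plus new tokens fits in the context window.
    encoded = tokenizer(prompt, return_tensors="pt", truncation=True,
                        max_length=MAX_SEQ_LEN - max_new_tokens)
    input_ids = encoded["input_ids"].to(model.device)
    with torch.no_grad():
        output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain what a transformer is.")` downloads the weights on first use, so the function is defined here but not invoked.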

Frequently Asked Questions

Q: What makes this model unique?

The model combines architecture modifications such as ALiBi and FlashAttention support with pre-training on 200B tokens of the RedPajama dataset, followed by instruction fine-tuning on Dolly. At 1.3 billion parameters, its computational requirements remain modest compared with larger language models.

Q: What are the recommended use cases?

The model is well-suited for text generation tasks, particularly those requiring instruction following. Its moderate size makes it a candidate for production deployments where larger models would be too costly to serve.
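
As a rough check on what "moderate size" means in practice, the sketch below estimates the parameter count and weight memory from the 24 layers and width 2048 given under Implementation Details. The 4x MLP expansion, the tied embedding, and the round vocabulary size of 50,000 are assumptions for illustration; LayerNorm weights are ignored, and activation and KV-cache memory are not counted.

```python
N_LAYERS = 24
D_MODEL = 2048
VOCAB_SIZE = 50_000  # assumption: round placeholder, not stated in this document


def estimate_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Approximate weight count for a bias-free decoder-only transformer."""
    attn = 4 * d_model * d_model  # Q, K, V, and output projections
    mlp = 8 * d_model * d_model   # two linear layers with 4x expansion (assumed)
    embed = vocab_size * d_model  # token embedding, assumed shared with the output head
    return n_layers * (attn + mlp) + embed


def weight_gib(n_params: int, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return n_params * bytes_per_param / 2**30


params = estimate_params(N_LAYERS, D_MODEL, VOCAB_SIZE)
print(f"estimated parameters: {params / 1e9:.2f}B")
print(f"bfloat16 weights: {weight_gib(params, 2):.2f} GiB")
print(f"float32 weights:  {weight_gib(params, 4):.2f} GiB")
```

Under these assumptions the estimate lands close to the stated 1.3 billion parameters, which puts the half-precision weights at a few GiB.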
