mpt-30b-chat

mpt-30b-chat

mosaicml

MPT-30B-Chat: A 30B parameter chatbot model by MosaicML, fine-tuned on diverse datasets with 8K token context window and FlashAttention support.

PropertyValue
Parameter Count29.95B
LicenseCC-By-NC-SA-4.0
Context Length8192 tokens
ArchitectureModified decoder-only transformer
Release DateJune 22, 2023

What is MPT-30B-Chat?

MPT-30B-Chat is an advanced language model developed by MosaicML, designed specifically for dialogue generation and multi-turn conversations. Built by fine-tuning the base MPT-30B model on diverse datasets including ShareGPT-Vicuna, Camel-AI, GPTeacher, and others, it represents a significant advancement in open-source language models that outperforms the original GPT-3.

Implementation Details

The model employs a modified decoder-only transformer architecture with several innovative features that enhance its performance and efficiency. The architecture includes 48 layers, 64 attention heads, and a dimensional model size of 7168.

  • Implements FlashAttention for improved computational efficiency
  • Uses ALiBi (Attention with Linear Biases) instead of traditional positional embeddings
  • Features an 8K token context window with expansion capability
  • Trained on 64 H100s for approximately 7.6 hours

Core Capabilities

  • Excels at multi-turn conversations and dialogue generation
  • Strong coding abilities due to specialized pretraining data
  • Supports context-length extrapolation via ALiBi
  • Efficient inference and training performance
  • Handles complex instruction following tasks

Frequently Asked Questions

Q: What makes this model unique?

MPT-30B-Chat stands out due to its combination of size (29.95B parameters), efficient architecture featuring FlashAttention and ALiBi, and its diverse training data mix including high-quality conversational datasets. The model's ability to handle 8K token contexts while supporting further extension makes it particularly versatile.

Q: What are the recommended use cases?

The model is best suited for chatbot applications, multi-turn conversations, coding assistance, and general dialogue generation. However, it's important to note that it's licensed for non-commercial use only under CC-By-NC-SA-4.0.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026