pythia-6.9b-deduped

EleutherAI

A 6.9B-parameter language model from EleutherAI's Pythia suite, trained on the deduplicated Pile dataset for interpretability research and scientific analysis.

| Property | Value |
|---|---|
| Parameter Count | 6.9B (6,857,302,016 total) |
| Model Type | Transformer-based Language Model |
| License | Apache 2.0 |
| Paper | Link |
| Training Data | The Pile (Deduplicated) |

What is pythia-6.9b-deduped?

Pythia-6.9B-deduped is a large language model from EleutherAI's Pythia Scaling Suite, a family of models specifically designed for interpretability research. The model has 32 layers, a model dimension of 4096, and 32 attention heads, and was trained on a deduplicated version of The Pile dataset.

Implementation Details

The model is built on the GPT-NeoX architecture and was trained with a batch size of 2M tokens and a learning rate of 1.2 × 10⁻⁴. Training ran for 143,000 steps, so the model saw approximately 299.9B tokens in total.
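The quoted token count follows directly from the training configuration, if "2M tokens" is read as 2²¹ tokens per step (an assumption; the model card itself just says "2M"):

```python
# Sanity-check the total-token figure from the training configuration above.
steps = 143_000
batch_tokens = 2 ** 21  # "2M tokens" per step, assumed to mean 2,097,152

total_tokens = steps * batch_tokens
print(f"{total_tokens:,}")  # 299,892,736,000 ≈ 299.9B tokens
```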

  • Architecture: 32 transformer layers with 4096 dimensional states
  • Attention Heads: 32
  • Non-embedding Parameters: 6,444,163,072
  • Training Dataset: Deduplicated version of The Pile

Core Capabilities

  • English language text generation and completion
  • Scientific research and model interpretability studies
  • Supports checkpoint analysis with 154 intermediate checkpoints
  • Compatible with Hugging Face Transformers library
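Per the Pythia model cards on the Hugging Face Hub, the 154 checkpoints are exposed as branch revisions: `step0`, log-spaced steps `step1` through `step512`, then every 1,000 steps up to `step143000`. A sketch that enumerates these revision tags (the loading call is shown commented out, since it triggers a multi-gigabyte download):

```python
# Enumerate the checkpoint revision tags: step0, log-spaced step1..step512,
# then evenly spaced step1000..step143000.
revisions = (
    ["step0"]
    + [f"step{2 ** i}" for i in range(10)]              # step1 .. step512
    + [f"step{s}" for s in range(1000, 144000, 1000)]   # step1000 .. step143000
)
print(len(revisions))  # 154 intermediate checkpoints

# A specific checkpoint could then be loaded via Transformers, e.g.:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model = AutoModelForCausalLM.from_pretrained(
#     "EleutherAI/pythia-6.9b-deduped", revision="step143000"
# )
# tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-6.9b-deduped")
```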

Frequently Asked Questions

Q: What makes this model unique?

This model is part of a carefully controlled experimental suite designed for interpretability research. It was trained on deduplicated data with frequent, precisely spaced checkpointing, which makes it well suited to studying how model behavior develops over the course of training.

Q: What are the recommended use cases?

The model is primarily intended for research purposes, particularly in studying language model behavior and interpretability. It's not recommended for deployment in production environments or direct user-facing applications.
