sae-llama-3.1-8b-64x

  • Author: EleutherAI
  • Base Model: Llama 3.1 8B
  • Training Data: RedPajama v2 (8.5B tokens)
  • Model Hub: HuggingFace

What is sae-llama-3.1-8b-64x?

sae-llama-3.1-8b-64x is a collection of sparse autoencoders (SAEs) trained on the activations of the Llama 3.1 8B model. The SAEs were trained on a 10B sample of the RedPajama v2 corpus, which works out to approximately 8.5B tokens when processed with the Llama 3 tokenizer. Each SAE learns to reconstruct the model's activations from a sparse set of features, making the collection a tool for compressing and, more importantly, interpreting the network's internal representations.

Implementation Details

These SAEs were trained with the MultiTopK loss, which sets them apart from the 32x release: the sparsity level (the number of active latents) can be varied at inference time, so a single SAE can be adapted to different computational requirements. The SAEs are organized by hookpoint and can be loaded with EleutherAI's SAE library, as the sketch after the list below shows.

  • Trained on Llama 3.1 8B architecture
  • Utilizes MultiTopK loss for variable sparsity
  • Organized by hookpoint for easy access
  • Compatible with EleutherAI's SAE library
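As a concrete starting point, here is a minimal loading sketch using EleutherAI's SAE library (the `sae` package). The repository ID follows the author and name shown above, and the hookpoint string is illustrative; check the Hugging Face repo for the exact hookpoint names included in this release.

    # Minimal loading sketch; assumes EleutherAI's `sae` library is installed.
    # The hookpoint name below is illustrative -- inspect the Hugging Face repo
    # to see which hookpoints ship with this release.
    from sae import Sae

    # Load a single SAE for one hookpoint
    sae = Sae.load_from_hub("EleutherAI/sae-llama-3.1-8b-64x", hookpoint="layers.23")

    # Or load every SAE in the collection, keyed by hookpoint name
    saes = Sae.load_many("EleutherAI/sae-llama-3.1-8b-64x")
    print(list(saes.keys()))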

Core Capabilities

  • Variable sparsity levels at inference time
  • Efficient model compression
  • Easy integration through Python library
  • Hookpoint-based organization for targeted analysis (illustrated below)
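To illustrate hookpoint-targeted analysis, the sketch below collects hidden states from Llama 3.1 8B with the transformers library and encodes one layer's activations with the matching SAE. The layer index, hookpoint name, and the exact structure returned by encode() are assumptions to verify against the SAE library's documentation.

    # Sketch: run text through Llama 3.1 8B and encode one layer's activations
    # with the corresponding SAE. Device/dtype handling is omitted for brevity.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from sae import Sae

    model_name = "meta-llama/Llama-3.1-8B"  # requires access to the gated base model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Hookpoint name is illustrative; pair it with the matching hidden-state index.
    sae = Sae.load_from_hub("EleutherAI/sae-llama-3.1-8b-64x", hookpoint="layers.23")

    inputs = tokenizer("Sparse autoencoders decompose activations.", return_tensors="pt")
    with torch.inference_mode():
        outputs = model(**inputs, output_hidden_states=True)
        # hidden_states[0] is the embedding output, so layer 23's output is index 24
        acts = outputs.hidden_states[24].flatten(0, 1)  # rows of activation vectors
        latents = sae.encode(acts)  # sparse latent activations at this hookpoint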

Frequently Asked Questions

Q: What makes this model unique?

This model's use of the MultiTopK loss sets it apart: the sparsity level can be chosen at inference time rather than fixed in advance, a flexibility not present in traditional SAEs. That makes it versatile across different computational requirements and use cases; the toy example below shows what varying the sparsity level means in practice.
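To make "flexible sparsity at inference" concrete, the toy example below applies a TopK activation to a vector of latent pre-activations at several values of k. It is a conceptual illustration of the effect of changing k, not the library's internal implementation.

    # Toy TopK sparsity: keep the k largest latent pre-activations, zero the rest.
    # A MultiTopK-trained SAE can be run with different k at inference time;
    # this standalone snippet only illustrates the effect of changing k.
    import torch

    def topk_activation(pre_acts: torch.Tensor, k: int) -> torch.Tensor:
        values, indices = torch.topk(pre_acts, k, dim=-1)
        sparse = torch.zeros_like(pre_acts)
        return sparse.scatter(-1, indices, values)

    pre_acts = torch.randn(1, 4096 * 64)  # 64x expansion of a 4096-dim residual stream
    for k in (32, 64, 128):
        latents = topk_activation(pre_acts, k)
        print(f"k={k}: {int((latents != 0).sum())} active latents")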

Q: What are the recommended use cases?

The SAEs are aimed at researchers and developers working on model interpretation and neural network compression, and at anyone who needs flexible sparsity levels in their application. They are particularly useful for analyzing and understanding the internal representations of the Llama 3.1 8B model.
