MusicGen Stereo Small

Property	Value
Parameters	300M
Developer	Meta AI (FAIR team)
Release Date	2023
License	Code: MIT, Weights: CC-BY-NC 4.0
Paper	Simple and Controllable Music Generation

What is musicgen-stereo-small?

MusicGen Stereo Small is a text-to-music generation model that produces stereophonic audio. It is a fine-tuned version of the original MusicGen small model, adapted to generate stereo music with greater spatial depth and directionality. The model is trained over a 32 kHz EnCodec tokenizer with 4 codebooks, and it produces stereo by generating two streams of tokens that are interleaved with a delay pattern.

Implementation Details

The model is a single-stage auto-regressive Transformer that operates on tokens produced by an EnCodec tokenizer. For stereo output, it works with two streams of tokens and interleaves them using a delay pattern. Because a small delay is introduced between codebooks, all 4 codebooks can be predicted in parallel in one pass, so the model needs only 50 auto-regressive steps per second of audio.

32kHz sampling rate with EnCodec tokenization
4 codebooks sampled at 50 Hz
Parallel prediction capability through small delays between codebooks
Trained on licensed music data from the Meta Music Initiative Sound Collection, the Shutterstock music collection, and the Pond5 music collection
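
The delay idea described above can be illustrated with a small sketch. This is not the model's actual implementation: it assumes a single stream of 4 codebooks, shifts codebook k by k steps, and uses a hypothetical PAD value for positions that have no token yet. How the two stereo streams are arranged relative to each other is not specified here and is left out. The function names are illustrative only.

```python
PAD = -1  # hypothetical placeholder meaning "no token at this step"


def apply_delay_pattern(codes, pad=PAD):
    """Shift codebook k right by k steps so every step emits one token per codebook.

    codes: list of K lists, each holding T token ids (one list per codebook).
    Returns K lists of length T + K - 1.
    """
    num_codebooks = len(codes)
    num_frames = len(codes[0])
    total_steps = num_frames + num_codebooks - 1
    delayed = []
    for k, stream in enumerate(codes):
        trailing = total_steps - k - num_frames
        delayed.append([pad] * k + list(stream) + [pad] * trailing)
    return delayed


def undo_delay_pattern(delayed):
    """Invert apply_delay_pattern and recover the original K x T layout."""
    num_codebooks = len(delayed)
    num_frames = len(delayed[0]) - num_codebooks + 1
    return [row[k:k + num_frames] for k, row in enumerate(delayed)]


# One second of audio at 50 frames per second with 4 codebooks.
# Token ids are made up so that each codebook is easy to recognise.
codes = [[k * 1000 + t for t in range(50)] for k in range(4)]
delayed = apply_delay_pattern(codes)
steps = len(delayed[0])  # 50 frames plus 3 extra steps for the delays
```

In this sketch, one second of audio takes 53 steps rather than 200, which is the sense in which decoding costs roughly 50 auto-regressive steps per second instead of 50 steps per codebook.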

Core Capabilities

High-quality stereo music generation from text descriptions
Support for various music styles and genres
50 token frames per second of generated audio (50 Hz)
Integration with popular ML frameworks like 🤗 Transformers
Efficient parallel processing design
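
As a rough illustration of the 🤗 Transformers integration, the sketch below wraps the library's text-to-audio pipeline in a helper. Several things here are assumptions rather than facts from this page: the hub identifier "facebook/musicgen-stereo-small", the output file name, the helper names, and the use of the soundfile package for writing audio. The layout of the returned audio array may also differ between library versions, so the transpose should be checked against the installed version.

```python
FRAME_RATE_HZ = 50  # token frames per second of audio, as stated above


def max_new_tokens_for(seconds, frame_rate_hz=FRAME_RATE_HZ):
    """Convert a target duration into a token budget for generation."""
    return int(round(seconds * frame_rate_hz))


def generate_stereo_clip(
    prompt,
    seconds=5.0,
    model_id="facebook/musicgen-stereo-small",  # assumed hub identifier
    out_path="musicgen_out.wav",  # hypothetical output path
):
    """Generate a clip from a text prompt and write it to a WAV file."""
    # Imported lazily so the duration helper works without these packages.
    import soundfile as sf
    from transformers import pipeline

    synthesiser = pipeline("text-to-audio", model_id)
    music = synthesiser(
        prompt,
        forward_params={"max_new_tokens": max_new_tokens_for(seconds)},
    )
    # Assumes audio[0] is (channels, samples); soundfile wants (samples, channels).
    sf.write(out_path, music["audio"][0].T, music["sampling_rate"])
    return out_path
```

Calling generate_stereo_clip("lo-fi music with a soothing melody", seconds=5) would download the weights on first use and request about 250 tokens, which corresponds to roughly five seconds of audio at 50 frames per second.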

Frequently Asked Questions

Q: What makes this model unique?

Unlike existing methods such as MusicLM, MusicGen does not require a self-supervised semantic representation, and it generates all 4 codebooks in one pass. This variant additionally produces stereophonic audio, which makes it one of the few models designed specifically for stereo music generation.

Q: What are the recommended use cases?

The model is primarily intended for research on AI-based music generation, including probing the limitations of generative models and exploring text-guided music creation. It should not be used in downstream applications without further risk evaluation and mitigation, and the CC-BY-NC 4.0 license on the weights restricts commercial use.