musicgen-stereo-small

Maintained by: facebook

MusicGen Stereo Small

Parameters: 300M
Developer: Meta AI (FAIR team)
Release Date: 2023
License: MIT (code), CC-BY-NC 4.0 (weights)
Paper: Simple and Controllable Music Generation

What is musicgen-stereo-small?

MusicGen Stereo Small is a text-to-music generation model that produces stereophonic audio. It is a fine-tuned version of the original MusicGen small model, adapted to generate stereo music with greater spatial depth and directionality. It operates on 32 kHz audio tokenized into 4 codebooks and, for stereo, generates two token streams that are interleaved using a delay pattern.

Implementation Details

The model is a single-stage auto-regressive Transformer that works on top of an EnCodec tokenizer. For stereo output it generates two streams of EnCodec tokens and interleaves them using a delay pattern. Because a small delay is introduced between codebooks, all 4 codebooks can be predicted in parallel in a single pass, so only 50 auto-regressive steps are needed per second of audio.

  • 32kHz sampling rate with EnCodec tokenization
  • 4 codebooks sampled at 50 Hz
  • Parallel prediction capability through small delays between codebooks
  • Trained on licensed music data from Meta Music Initiative, Shutterstock, and Pond5
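
The delay between codebooks can be illustrated with a small sketch. It assumes a one-step delay per codebook (codebook k is shifted right by k steps), which is the simplest reading of the description above; the `PAD` value and both function names are hypothetical, and the way the two stereo streams are interleaved is not covered here.

```python
PAD = -1  # hypothetical placeholder for positions that hold no token yet


def apply_delay_pattern(codes):
    """Shift codebook k right by k steps.

    codes: list of K lists, each holding T token ids (one list per codebook).
    Returns K lists of length T + K - 1.
    """
    num_codebooks = len(codes)
    num_frames = len(codes[0])
    total_steps = num_frames + num_codebooks - 1
    delayed = []
    for k, stream in enumerate(codes):
        row = [PAD] * total_steps
        row[k:k + num_frames] = stream  # same-length slice assignment
        delayed.append(row)
    return delayed


def undo_delay_pattern(delayed):
    """Inverse of apply_delay_pattern: realign the codebooks frame by frame."""
    num_codebooks = len(delayed)
    num_frames = len(delayed[0]) - num_codebooks + 1
    return [row[k:k + num_frames] for k, row in enumerate(delayed)]


def steps_for_duration(seconds, frame_rate_hz=50, num_codebooks=4):
    """Auto-regressive steps under this sketch: one per frame plus the delay tail."""
    return int(round(seconds * frame_rate_hz)) + num_codebooks - 1
```

Under these assumptions each decoding step emits one token for every codebook, so the cost grows with the number of frames (50 per second) rather than with frames times codebooks; the delay adds only a few extra steps at the end of the sequence.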

Core Capabilities

  • High-quality stereo music generation from text descriptions
  • Support for various music styles and genres
  • 50Hz token generation rate
  • Integration with popular ML frameworks like 🤗 Transformers
  • Efficient parallel processing design
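
As a rough sketch of the 🤗 Transformers integration, the snippet below loads the checkpoint and writes a stereo WAV file. It assumes the Hub id `facebook/musicgen-stereo-small`, that `transformers`, `torch`, and `soundfile` are installed, and that the generated tensor has shape (batch, channels, samples); the helper names `max_new_tokens_for` and `generate_clip` are hypothetical.

```python
FRAME_RATE_HZ = 50  # from the description above: 50 auto-regressive steps per second


def max_new_tokens_for(seconds):
    """Approximate number of generation steps for a clip of the given length."""
    return int(round(seconds * FRAME_RATE_HZ))


def generate_clip(prompt, seconds=5.0, out_path="musicgen_out.wav"):
    """Generate a short stereo clip from a text prompt and save it as WAV."""
    # Heavy imports are kept local so the helper above works without them.
    import soundfile as sf
    from transformers import AutoProcessor, MusicgenForConditionalGeneration

    model_id = "facebook/musicgen-stereo-small"  # assumed Hub id
    processor = AutoProcessor.from_pretrained(model_id)
    model = MusicgenForConditionalGeneration.from_pretrained(model_id)

    inputs = processor(text=[prompt], padding=True, return_tensors="pt")
    audio = model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=max_new_tokens_for(seconds),
    )

    sampling_rate = model.config.audio_encoder.sampling_rate
    # Assumed shape (batch, channels, samples); soundfile expects (samples, channels).
    sf.write(out_path, audio[0].cpu().numpy().T, sampling_rate)
    return out_path


# Example call (downloads the model weights on first use):
# generate_clip("lo-fi music with a soothing melody", seconds=5.0)
```

Clip length is controlled through `max_new_tokens`, so a 5-second clip corresponds to roughly 250 generation steps at the 50 Hz token rate.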

Frequently Asked Questions

Q: What makes this model unique?

Unlike existing methods such as MusicLM, MusicGen does not require a self-supervised semantic representation, and it generates all codebooks in a single pass. This variant also produces stereophonic output, making it one of the few models designed specifically for stereo music generation.

Q: What are the recommended use cases?

The model is primarily intended for research purposes in AI-based music generation, including understanding generative model limitations and exploring text-guided music creation. It's not recommended for commercial applications without proper risk evaluation and mitigation.
