MERT-v0-public
| Property | Value |
|---|---|
| Parameter Count | 95M |
| Model Type | Music Understanding Model |
| Architecture | 12-layer Transformer with 768-dimensional features |
| Training Data | 900 hours of open-source music |
| Sample Rate | 16 kHz |
| Feature Rate | 50 Hz |
| Paper | arXiv:2306.00107 |
What is MERT-v0-public?
MERT-v0-public is a self-supervised music audio understanding model trained exclusively on non-commercial, open-source datasets, including Music4All and a filtered version of FMA_full. It is part of the m-a-p model family and is trained with the Masked Language Model (MLM) paradigm.
Implementation Details
The model uses a 12-layer Transformer architecture with 768-dimensional feature outputs. It ingests audio at 16 kHz and generates features at 50 Hz (50 feature frames per second). It was pre-trained with 5-second context windows and outputs 13 layers of representations (the input layer plus the 12 transformer layers) that can be used for various downstream tasks; see the feature-extraction sketch after the list below.
- Pre-trained with the MLM paradigm on 900 hours of music
- Supports variable length audio inputs
- Generates layer-wise representations suitable for different tasks
- Implements efficient feature extraction and processing
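A minimal feature-extraction sketch using the Hugging Face `transformers` API is given below. The checkpoint name `m-a-p/MERT-v0-public`, the use of `Wav2Vec2FeatureExtractor`, and the input file are assumptions based on the common usage pattern for MERT checkpoints, not specifics confirmed by this card:

```python
# Sketch: extract layer-wise MERT features from an audio file.
# Assumes the checkpoint is published as "m-a-p/MERT-v0-public" on the
# Hugging Face Hub and loads its custom architecture via trust_remote_code.
import torch
import torchaudio
from transformers import AutoModel, Wav2Vec2FeatureExtractor

model = AutoModel.from_pretrained("m-a-p/MERT-v0-public", trust_remote_code=True)
processor = Wav2Vec2FeatureExtractor.from_pretrained("m-a-p/MERT-v0-public")

# Load audio, mix down to mono, and resample to the expected 16 kHz input rate.
waveform, sr = torchaudio.load("example.wav")  # hypothetical input file
waveform = waveform.mean(dim=0)
if sr != processor.sampling_rate:
    waveform = torchaudio.functional.resample(waveform, sr, processor.sampling_rate)

inputs = processor(waveform.numpy(), sampling_rate=processor.sampling_rate,
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# 13 hidden states (input layer + 12 transformer layers), each of shape
# [batch, time_steps, 768] at roughly 50 feature frames per second.
hidden_states = torch.stack(outputs.hidden_states)
print(hidden_states.shape)  # e.g. torch.Size([13, 1, T, 768])
```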
Core Capabilities
- Music audio representation learning
- Unsupervised feature extraction
- Support for downstream music understanding tasks
- Time-based and utterance-level classification
- Flexible feature aggregation options (sketched after this list)
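As a sketch of the aggregation options, the layer-wise `hidden_states` tensor from the extraction example above can be averaged over time for utterance-level tasks, or kept as a 50 Hz frame sequence for time-based tasks (the shapes follow from the stated 13-layer, 768-dimensional output):

```python
# Utterance-level representation: average each layer over the time axis.
# hidden_states: [13, 1, T, 768] from the extraction sketch above.
time_reduced = hidden_states.mean(dim=-2)  # -> [13, 1, 768]

# Time-based tasks instead keep the 50 Hz frame sequence from a chosen layer,
# e.g. the last transformer layer:
frame_features = hidden_states[-1]         # -> [1, T, 768]
```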
Frequently Asked Questions
Q: What makes this model unique?
MERT-v0-public stands out for being trained entirely on open-source, non-commercial music data, making it particularly suitable for academic and research applications. Its MLM training paradigm and layered architecture allow for flexible feature extraction at different levels of abstraction.
Q: What are the recommended use cases?
The model is well-suited for a range of music understanding tasks, including music classification, feature extraction, and analysis. It supports both time-based analysis and utterance-level classification, and individual transformer layers can be selected or combined for task-specific optimization, as sketched below.
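One common pattern for exploiting the layer-wise representations, sketched here under the same assumptions as the examples above, is a learnable weighted combination of all 13 layers via a 1x1 convolution; this is an illustrative recipe, not a prescribed one:

```python
import torch
from torch import nn

# Learnable weighting over the 13 layer outputs via a 1x1 convolution.
# time_reduced: [13, 1, 768] utterance-level features from the sketch above.
aggregator = nn.Conv1d(in_channels=13, out_channels=1, kernel_size=1)
weighted = aggregator(time_reduced.permute(1, 0, 2)).squeeze(1)  # -> [1, 768]
# `weighted` can now feed a task-specific classification head.
```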