pop2piano

Maintained By
sweetcocoa

Pop2Piano

Property      Value
Authors       Jongho Choi and Kyogu Lee
Paper         Pop2Piano: Pop Audio-based Piano Cover Generation
Architecture  T5-based Encoder-Decoder Transformer
Downloads     9,489

What is pop2piano?

Pop2Piano is a transformer-based model that automatically generates piano covers from pop music audio. It processes the audio waveform directly, without intermediate steps such as melody or chord extraction. According to its authors, it is the first model to generate piano covers from pop audio in this way.

Implementation Details

The model uses a T5-based encoder-decoder architecture. The encoder turns the input audio waveform into latent representations. The decoder then autoregressively generates a sequence of tokens of four types (time, velocity, note, and special), and that sequence is converted into a piano cover in MIDI format.

  • Direct audio-to-MIDI conversion capability
  • Autoregressive token generation
  • Multiple composer styles available
  • Optimized for 44.1 kHz sampling rate
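
To make the token-to-MIDI step above more concrete, here is a minimal sketch of how a stream of time, velocity, note, and special tokens could be folded into note events. The token layout is a simplification chosen for illustration and is an assumption, not the model's actual vocabulary: tokens are `(kind, value)` pairs, a time token sets an absolute step index, a velocity token switches between note-on (1) and note-off (0), and `Note` and `decode_tokens` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int  # MIDI pitch number
    start: int  # step index where the note begins
    end: int    # step index where the note is released


def decode_tokens(tokens):
    """Fold (kind, value) tokens into Note events (illustrative layout only)."""
    notes = []
    active = {}        # pitch -> onset step for notes currently held
    current_time = 0   # most recent time token seen
    note_on = True     # current velocity state: True = onset, False = release
    for kind, value in tokens:
        if kind == "special":
            if value == "EOS":
                break      # end of sequence: ignore everything after it
            continue       # any other special token (e.g. padding) is skipped
        if kind == "time":
            current_time = value
        elif kind == "velocity":
            note_on = value > 0
        elif kind == "note":
            if note_on:
                # Keep the earliest onset if the pitch is already held.
                active.setdefault(value, current_time)
            elif value in active:
                notes.append(Note(value, active.pop(value), current_time))
    # Notes never released are closed at the last time step seen.
    for pitch, start in active.items():
        notes.append(Note(pitch, start, current_time))
    return sorted(notes, key=lambda n: (n.start, n.pitch))


example = [
    ("time", 0), ("velocity", 1), ("note", 60), ("note", 64),
    ("time", 2), ("velocity", 0), ("note", 60),
    ("time", 4), ("note", 64),
    ("special", "EOS"),
    ("note", 72),  # after EOS, so it is ignored
]
for note in decode_tokens(example):
    print(note)
```

In a real pipeline the step indices would still have to be mapped to seconds (for example via detected beat positions) before a MIDI file is written; that mapping is left out here.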

Core Capabilities

  • Transform pop music audio into piano covers
  • Support for various musical styles including Korean Pop and Western Pop
  • Generate different interpretations using different composer settings
  • Process high-quality audio input (44.1 kHz)
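
The capabilities above can be tried through the Hugging Face `transformers` integration. The sketch below assumes that `transformers` (with PyTorch), `librosa`, and the extra audio dependencies the Pop2Piano processor needs (such as `pretty_midi` and `essentia`) are installed, that the checkpoint is published as `sweetcocoa/pop2piano`, and that `"composer1"` is a valid composer name for it. The function name and file paths are placeholders.

```python
def generate_piano_cover(audio_path, midi_path, composer="composer1"):
    """Generate a piano-cover MIDI file from a pop audio file (sketch)."""
    # Imports are kept inside the function so that merely defining it
    # does not require the heavy audio/ML dependencies.
    import librosa
    from transformers import (
        Pop2PianoForConditionalGeneration,
        Pop2PianoProcessor,
    )

    # Load audio at 44.1 kHz, the sampling rate the model is optimized for.
    audio, sr = librosa.load(audio_path, sr=44100)

    # Assumed checkpoint id; swap in another one if yours differs.
    model = Pop2PianoForConditionalGeneration.from_pretrained("sweetcocoa/pop2piano")
    processor = Pop2PianoProcessor.from_pretrained("sweetcocoa/pop2piano")

    # Waveform -> encoder input features.
    inputs = processor(audio=audio, sampling_rate=sr, return_tensors="pt")

    # Autoregressive token generation, conditioned on a composer style.
    token_ids = model.generate(
        input_features=inputs["input_features"],
        composer=composer,
    )

    # Tokens -> pretty_midi object, then write it to disk.
    midi = processor.batch_decode(
        token_ids=token_ids,
        feature_extractor_output=inputs,
    )["pretty_midi_objects"][0]
    midi.write(midi_path)
    return midi
```

Calling the function again with a different `composer` value gives another interpretation of the same song; which composer names a checkpoint accepts depends on its generation config, so treat the default here as an example.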

Frequently Asked Questions

Q: What makes this model unique?

According to its authors, Pop2Piano is the first model to generate piano covers directly from pop audio without separate melody and chord extraction steps, which removes a stage from the conversion pipeline.

Q: What are the recommended use cases?

The model suits musicians, content creators, and music enthusiasts who want to create piano versions of pop songs. It works best on Korean Pop, the genre it was mainly trained on, but also performs well on Western Pop and Hip Hop.
