GZ_IsoTech
| Property | Value |
|---|---|
| Dataset Size | 2,824 audio clips |
| Best Performance | 85.5% (using ViT-L-16) |
| Paper | CCMusic (TISMIR 2025) |
| License | Research Use |
What is GZ_IsoTech?
GZ_IsoTech is a model for recognizing and classifying playing techniques of the Guzheng, a traditional Chinese plucked zither. It was trained on a dataset of 2,824 audio clips: 2,328 synthesized samples and 496 recordings by a professional Guzheng artist.
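The dataset composition can be checked with a few lines of arithmetic. This is only an illustrative sketch; the variable names are made up for the example, and the counts are the ones quoted in this card.

```python
# Clip counts as stated in this card.
SYNTHESIZED_CLIPS = 2328
RECORDED_CLIPS = 496

total_clips = SYNTHESIZED_CLIPS + RECORDED_CLIPS

# Share of each source, as a percentage of the whole dataset.
synth_share = 100.0 * SYNTHESIZED_CLIPS / total_clips
recorded_share = 100.0 * RECORDED_CLIPS / total_clips

print(f"total clips: {total_clips}")
print(f"synthesized: {synth_share:.1f}%")
print(f"recorded:    {recorded_share:.1f}%")
```

Roughly four out of five clips are synthesized, which is worth keeping in mind when splitting the data or interpreting results on real recordings.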
Implementation Details
Several backbone architectures were evaluated, with the Vision Transformer (ViT-L-16) achieving the best performance. Audio input is represented as a Mel-spectrogram, a Constant-Q Transform (CQT), or Chroma features.
- Multiple backbone architectures tested (ViT, MaxViT, ResNeXt, ResNet, RegNet)
- Best performance achieved with ViT-L-16 (304.3M parameters)
- Supports three input features: Mel, CQT, and Chroma
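The three input features differ mainly in how they lay out the frequency axis. The sketch below illustrates that difference with the standard library only; it does not compute spectrograms and is not the model's actual preprocessing. The function names and all parameter values (128 Mel bands, 84 CQT bins, the frequency limits) are assumptions chosen for illustration.

```python
import math
from typing import List


def hz_to_mel(hz: float) -> float:
    """Convert Hz to mel using the HTK-style formula."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, fmin: float, fmax: float) -> List[float]:
    """Center frequencies (Hz) of bands equally spaced on the mel scale."""
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_mels)]


def cqt_bin_frequencies(n_bins: int, fmin: float,
                        bins_per_octave: int = 12) -> List[float]:
    """Geometrically spaced CQT bin frequencies (Hz), starting at fmin."""
    return [fmin * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def chroma_class(freq_hz: float) -> int:
    """Pitch class (0 = C, ..., 11 = B) of the nearest equal-tempered note."""
    midi = 69.0 + 12.0 * math.log2(freq_hz / 440.0)
    return int(round(midi)) % 12


# Hypothetical settings, not taken from the model card.
mel_centers = mel_band_centers(n_mels=128, fmin=0.0, fmax=11025.0)
cqt_freqs = cqt_bin_frequencies(n_bins=84, fmin=32.70)

print(f"first/last Mel band center: {mel_centers[0]:.1f} / {mel_centers[-1]:.1f} Hz")
print(f"CQT bin 0 and bin 12 (one octave up): {cqt_freqs[0]:.2f} / {cqt_freqs[12]:.2f} Hz")
print(f"pitch class of 440 Hz: {chroma_class(440.0)}")
```

Mel bands widen with frequency, CQT bins follow musical semitones, and Chroma folds all octaves into 12 pitch classes. The three representations therefore emphasize different aspects of a technique such as a slide or a harmonic.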
Core Capabilities
- Recognition of 8 distinct Guzheng techniques: Vibrato, Slide-up, Slide-down, Return Slide, Glissando, Thumb Plucking, Harmonics, and Plucking Techniques
- Best reported accuracy of 85.5%, obtained with ViT-L-16 on Mel features
- Trained on both synthesized samples and real recordings by a professional player
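As a sketch of how an 8-class output could be turned into a technique label, the snippet below applies a softmax to a vector of logits and picks the most likely class. The label order in `TECHNIQUES` and the `decode` helper are assumptions for illustration; the actual class indices used by the model are not stated in this card.

```python
import math
from typing import List, Sequence, Tuple

# Assumed label order (hypothetical); the card lists the names but not the indices.
TECHNIQUES: List[str] = [
    "Vibrato", "Slide-up", "Slide-down", "Return Slide",
    "Glissando", "Thumb Plucking", "Harmonics", "Plucking Techniques",
]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most likely technique name and its probability."""
    if len(logits) != len(TECHNIQUES):
        raise ValueError(f"expected {len(TECHNIQUES)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return TECHNIQUES[best], probs[best]


# Made-up logits for demonstration only.
example_logits = [0.1, 0.3, 0.2, 0.0, 2.5, 0.4, 0.1, 0.2]
label, confidence = decode(example_logits)
print(f"{label} ({confidence:.2f})")
```

Returning the probability alongside the label lets a downstream application, such as a teaching tool, flag low-confidence predictions for manual review.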
Frequently Asked Questions
Q: What makes this model unique?
GZ_IsoTech is designed specifically for Guzheng technique recognition and is trained on both synthesized and real recordings. It supports three input representations (Mel, CQT, and Chroma) and reaches 85.5% with ViT-L-16, the best result among the backbones tested.
Q: What are the recommended use cases?
The model is ideal for music education, digital archiving of Guzheng performances, automated technique analysis, and research in traditional Chinese music. It can be integrated into learning platforms, music analysis software, and digital preservation systems.