xm_transformer_unity_en-hk

Maintained By
facebook

Property      Value
License       cc-by-nc-4.0
Framework     Fairseq
Task Type     Speech-to-Speech Translation
Dataset       MuST-C

What is xm_transformer_unity_en-hk?

xm_transformer_unity_en-hk is a speech-to-speech translation model developed by Facebook that translates English speech directly into Hokkien speech. It uses UnitY, a two-pass decoder architecture, and was trained on supervised data from the TED domain and weakly supervised data from the TED and Audiobook domains.

Implementation Details

The model runs as a two-stage pipeline: the translation model converts English audio into discrete speech units, and a vocoder synthesizes Hokkien audio from those units. It uses facebook/unit_hifigan_HK_layer12.km2500_frame_TAT-TTS for speech synthesis and requires 16000 Hz mono-channel audio input. The implementation is built on the Fairseq framework.

  • Two-pass decoder architecture (UnitY)
  • Speech synthesis through a unit-based HiFi-GAN vocoder
  • Training data drawn from the TED and Audiobook domains
  • Direct speech-to-speech translation rather than a cascade of separate speech recognition, text translation, and speech synthesis systems
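
The 16000 Hz mono input requirement can be checked before any audio reaches the model. The following is a minimal sketch that uses only Python's standard wave module. The function name check_wav_format and the two constants are illustrative and are not part of Fairseq or the model's own tooling. It assumes the input is an uncompressed WAV file.

```python
import wave

EXPECTED_RATE = 16000   # Hz, the input rate stated in the model description
EXPECTED_CHANNELS = 1   # mono


def check_wav_format(path):
    """Return a list of format problems found in a WAV header (empty if none)."""
    problems = []
    # Only the header is inspected; no audio frames are read.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"found {channels} channels, expected {EXPECTED_CHANNELS} (mono)")
    return problems
```

An empty list means the file matches the stated input format. Any entries describe what would need to be converted first.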

Core Capabilities

  • Direct English to Hokkien speech translation
  • Speech synthesis through a dedicated Hokkien vocoder
  • Accepts 16 kHz mono-channel audio input
  • Trained on both supervised and weakly supervised data
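
A stereo recording has to be reduced to a single channel before it fits the mono input format. The following is a minimal sketch, assuming 16-bit PCM WAV input and using only the Python standard library. downmix_to_mono is a hypothetical helper, not a Fairseq function. It averages the channels and leaves the sample rate unchanged, so resampling to 16 kHz would still need a dedicated audio library.

```python
import array
import sys
import wave


def downmix_to_mono(src_path, dst_path):
    """Average the channels of a 16-bit PCM WAV file into one mono channel.

    Returns the (unchanged) sample rate of the written file.
    """
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        rate = src.getframerate()
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))

    # WAV sample data is little-endian; swap on big-endian hosts.
    if sys.byteorder == "big":
        samples.byteswap()

    # Samples are interleaved per frame: average each group of `channels` values.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + channels]) // channels
            for i in range(0, len(samples), channels)
        ),
    )

    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(mono.tobytes())
    return rate
```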

Frequently Asked Questions

Q: What makes this model unique?

The model performs direct speech-to-speech translation from English to Hokkien, a language pair with limited resources. It pairs the UnitY two-pass decoder with a dedicated vocoder for Hokkien speech output.

Q: What are the recommended use cases?

The model is best suited to English-to-Hokkien translation of content that resembles its training domains: TED-style talks and audiobooks. It fits scenarios that need spoken Hokkien output directly from English audio.
