Brief Details: Text-to-speech Transformer model for Russian, trained on the Common Voice v7 and CSS10 datasets. Single-speaker male voice, developed by Facebook.
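A minimal synthesis sketch via fairseq's TTS hub interface; the checkpoint id facebook/tts_transformer-ru-cv7_css10 is an assumption for this entry:

```python
# Sketch: Russian TTS through fairseq's hub interface (assumed checkpoint id).
from fairseq.checkpoint_utils import load_model_ensemble_and_task_from_hf_hub
from fairseq.models.text_to_speech.hub_interface import TTSHubInterface

models, cfg, task = load_model_ensemble_and_task_from_hf_hub(
    "facebook/tts_transformer-ru-cv7_css10",  # assumed checkpoint id
    arg_overrides={"vocoder": "hifigan", "fp16": False},
)
model = models[0]
TTSHubInterface.update_cfg_with_data_cfg(cfg, task.data_cfg)
generator = task.build_generator([model], cfg)

text = "Здравствуйте, это пробный запуск."  # "Hello, this is a test run."
sample = TTSHubInterface.get_model_input(task, text)
wav, rate = TTSHubInterface.get_prediction(task, model, generator, sample)
# wav is a 1-D waveform tensor sampled at `rate` Hz, ready to save or play.
```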
Brief Details: Facebook's speech-to-text Transformer model with 29.5M parameters, trained on LibriSpeech ASR. Achieves 4.3% WER on the test-clean set; optimized for English ASR tasks.
Brief Details: Multilingual speech-to-text translation Transformer supporting 9 languages, trained on the MuST-C dataset, with strong BLEU scores (24.5-34.9) across European language pairs.
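Both S2T entries above load through the same Speech2Text classes in transformers; a minimal sketch, assuming the LibriSpeech checkpoint id facebook/s2t-small-librispeech-asr:

```python
# Sketch: English ASR with the S2T encoder-decoder (assumed checkpoint id).
import torch
from transformers import Speech2TextForConditionalGeneration, Speech2TextProcessor

processor = Speech2TextProcessor.from_pretrained("facebook/s2t-small-librispeech-asr")
model = Speech2TextForConditionalGeneration.from_pretrained("facebook/s2t-small-librispeech-asr")

waveform = torch.zeros(16000)  # stand-in for one second of 16kHz mono audio
inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
generated_ids = model.generate(inputs["input_features"], attention_mask=inputs["attention_mask"])
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```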
Brief Details: Facebook's speech recognition model built on the data2vec framework. Achieves 2.77 WER on LibriSpeech test-clean. Pretrained with self-supervised learning on 16kHz audio.
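A minimal CTC decoding sketch, assuming the checkpoint id facebook/data2vec-audio-base-960h for this entry:

```python
# Sketch: greedy CTC decoding with a data2vec-audio checkpoint (assumed id).
import torch
from transformers import Data2VecAudioForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/data2vec-audio-base-960h")
model = Data2VecAudioForCTC.from_pretrained("facebook/data2vec-audio-base-960h")

waveform = torch.zeros(16000)  # stand-in for one second of 16kHz mono audio
inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```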
Brief Details: RAG-Token base model for knowledge-intensive NLP tasks, pairing a DPR question encoder with a BART generator. Apache 2.0 license.
Brief Details: RAG-Sequence base model for knowledge-intensive NLP tasks, pairing a DPR question encoder with a BART-large generator and retrieval augmentation.
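A loading sketch covering both RAG entries, assuming the checkpoint id facebook/rag-token-base (the -sequence variant is analogous); a dummy retrieval index keeps it runnable, but these base checkpoints are meant to be fine-tuned before producing useful answers:

```python
# Sketch: retrieval-augmented generation with a dummy index (assumed checkpoint id).
from transformers import RagRetriever, RagTokenForGeneration, RagTokenizer

tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-base")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-base", index_name="exact", use_dummy_dataset=True
)
model = RagTokenForGeneration.from_pretrained("facebook/rag-token-base", retriever=retriever)

inputs = tokenizer("who holds the record in 100m freestyle", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```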
Brief Details: DETR object detection model with ResNet-50 backbone, 41.6M parameters. Achieves 43.3 AP on COCO. Transformer-based architecture for end-to-end object detection.
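A minimal detection sketch; the checkpoint id facebook/detr-resnet-50 is an assumption (the dc5 variant is the one matching the 43.3 AP figure):

```python
# Sketch: end-to-end object detection with DETR (assumed checkpoint id).
import torch
from PIL import Image
from transformers import DetrForObjectDetection, DetrImageProcessor

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")

image = Image.new("RGB", (640, 480))  # stand-in for a real photo
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)

# Keep detections above a score threshold, rescaled to the original image size.
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(
    outputs, target_sizes=target_sizes, threshold=0.9
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 3), box.tolist())
```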
Brief Details: ConvNeXT base model for 224x224 images, a pure-CNN architecture modernized with Transformer-inspired design choices. Achieves strong ImageNet-1k classification accuracy.
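A minimal classification sketch, assuming the checkpoint id facebook/convnext-base-224:

```python
# Sketch: ImageNet-1k classification with ConvNeXT (assumed checkpoint id).
import torch
from PIL import Image
from transformers import ConvNextForImageClassification, ConvNextImageProcessor

processor = ConvNextImageProcessor.from_pretrained("facebook/convnext-base-224")
model = ConvNextForImageClassification.from_pretrained("facebook/convnext-base-224")

image = Image.new("RGB", (224, 224))  # stand-in for a real photo
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])  # predicted class label
```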
Brief Details: EDSR is an image super-resolution model using enhanced deep residual networks, supporting 2x-4x upscaling with competitive PSNR/SSIM results.
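A minimal upscaling sketch via the super-image library, which hosts EDSR weights; the checkpoint id eugenesiow/edsr-base is an assumption for this entry (requires `pip install super-image`):

```python
# Sketch: 2x super-resolution with EDSR via super-image (assumed checkpoint id).
from PIL import Image
from super_image import EdsrModel, ImageLoader

model = EdsrModel.from_pretrained("eugenesiow/edsr-base", scale=2)
image = Image.new("RGB", (64, 64))  # stand-in for a low-resolution input
inputs = ImageLoader.load_image(image)
upscaled = model(inputs)
ImageLoader.save_image(upscaled, "upscaled_2x.png")
```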
Brief Details: BART-based paraphrasing model with 406M parameters, fine-tuned on the Quora, PAWS, and MSR paraphrase datasets. Frames paraphrasing as text-to-text generation.
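A minimal paraphrasing sketch; the checkpoint id eugenesiow/bart-paraphrase is an assumption for this entry:

```python
# Sketch: paraphrase generation with a BART seq2seq checkpoint (assumed id).
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("eugenesiow/bart-paraphrase")
model = BartForConditionalGeneration.from_pretrained("eugenesiow/bart-paraphrase")

sentence = "They were there to enjoy us and they were there to pray for us."
inputs = tokenizer(sentence, return_tensors="pt")
generated = model.generate(inputs["input_ids"], num_beams=5, max_length=64)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```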
Brief Details: GuwenBERT-large is a specialized RoBERTa model for Classical Chinese text processing, trained on 1.7B characters from ancient texts. Excels at tasks such as NER, with an 84.63% F1 score (usage sketch after the base-model entry below).
Brief Details: Japanese ASR model trained on the JSUT dataset using ESPnet's Conformer architecture. Performs character-level speech recognition directly on raw audio.
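A minimal ESPnet2 inference sketch (requires `pip install espnet espnet_model_zoo`); the model tag is a placeholder, and the IWSLT22 translation entry further below follows the analogous pattern via espnet2.bin.st_inference:

```python
# Sketch: ESPnet2 ASR inference; the model tag is a placeholder for this entry.
import numpy as np
from espnet2.bin.asr_inference import Speech2Text

speech2text = Speech2Text.from_pretrained("your-espnet-jsut-conformer-tag")  # placeholder tag
speech = np.zeros(16000, dtype=np.float32)  # stand-in for raw 16kHz audio samples
nbests = speech2text(speech)
text, *_ = nbests[0]  # best hypothesis: (text, tokens, token_ids, hyp)
print(text)
```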
Brief Details: GuwenBERT is a specialized RoBERTa-based model for Classical Chinese text processing, trained on 1.7B characters from ancient texts. Achieves strong NER performance with an 84.63% F1 score.
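A masked-token sketch covering both GuwenBERT entries; the checkpoint id ethanyt/guwenbert-large is an assumption (the base variant loads the same way):

```python
# Sketch: fill-mask on Classical Chinese with GuwenBERT (assumed checkpoint id).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="ethanyt/guwenbert-large")
# Build the input with the tokenizer's own mask token to stay tokenizer-agnostic.
masked = fill_mask.tokenizer.mask_token + "太元中，武陵人捕鱼为业。"
for candidate in fill_mask(masked):
    print(candidate["token_str"], round(candidate["score"], 3))
```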
Brief Details: ESPnet-based speech translation model trained on the IWSLT22 dialect dataset, using a Conformer architecture with a CTC loss weight of 0.3 and SpecAugment data augmentation.
Brief Details: A specialized BERT model for medical discharge summaries, initialized from BioBERT and further trained on MIMIC-III clinical data. Optimized for clinical NLP tasks.
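A minimal encoding sketch; the checkpoint id emilyalsentzer/Bio_Discharge_Summary_BERT is an assumption for this entry:

```python
# Sketch: contextual embeddings for a clinical note (assumed checkpoint id).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("emilyalsentzer/Bio_Discharge_Summary_BERT")
model = AutoModel.from_pretrained("emilyalsentzer/Bio_Discharge_Summary_BERT")

note = "Patient discharged home in stable condition on metoprolol."
inputs = tokenizer(note, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # per-token contextual embeddings
print(hidden.shape)
```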
Brief Details: A fine-tuned XLSR-53 model for Arabic speech recognition, achieving 26.55% WER on the Common Voice test set. Expects 16kHz audio input.
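A minimal transcription sketch; the model id is a placeholder, substitute the Arabic XLSR-53 checkpoint for this entry:

```python
# Sketch: Arabic ASR with a fine-tuned XLSR-53 checkpoint (placeholder model id).
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "your-username/wav2vec2-large-xlsr-53-arabic"  # placeholder
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

waveform = torch.zeros(16000)  # stand-in for one second of 16kHz mono audio
inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits
print(processor.batch_decode(torch.argmax(logits, dim=-1)))
```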
Brief Details: German ELECTRA-based question-answering model trained on 130k QA pairs, achieving 70.97% exact-match accuracy. Optimized for German reading comprehension.
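A minimal extractive-QA sketch via the pipeline API; the model id is a placeholder for this entry's checkpoint:

```python
# Sketch: German extractive question answering (placeholder model id).
from transformers import pipeline

qa = pipeline("question-answering", model="your-username/german-electra-qa")  # placeholder
result = qa(
    question="Wo wurde das Unternehmen gegründet?",   # "Where was the company founded?"
    context="Das Unternehmen wurde 1998 in München gegründet.",
)
print(result["answer"], round(result["score"], 3))
```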
Brief Details: A Spanish-optimized sentence similarity model fine-tuned on the stsb_multi_mt dataset, achieving a 0.82 Pearson correlation on semantic similarity tasks (usage sketch after the DistilBERT entry below).
Brief Details: KenLM provides probabilistic n-gram language models covering 24 languages, used for perplexity-based text filtering and sampling. MIT license.
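A minimal scoring sketch with the kenlm Python bindings; the .arpa path is a placeholder for one of the per-language model files:

```python
# Sketch: perplexity scoring for corpus filtering with KenLM (placeholder path).
import kenlm

model = kenlm.Model("es.arpa")  # placeholder path to a per-language model file
text = "el gato se sienta en la alfombra"
print(model.perplexity(text))   # lower perplexity = more fluent under the LM
print(model.score(text))        # total log10 probability of the sentence
```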
Brief Details: Spanish sentence similarity model based on DistilBERT, fine-tuned on the STS Benchmark dataset, achieving a 0.74 Pearson correlation for semantic analysis.
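A similarity sketch covering both Spanish STS entries above, via sentence-transformers; the model id is a placeholder for either checkpoint:

```python
# Sketch: Spanish semantic similarity via cosine score (placeholder model id).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("your-username/spanish-sts-model")  # placeholder
embeddings = model.encode(
    ["El clima es agradable hoy.", "Hoy hace buen tiempo."],
    convert_to_tensor=True,
)
print(util.cos_sim(embeddings[0], embeddings[1]).item())  # closer to 1 = more similar
```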
Brief Details: A German question generation model fine-tuned on drink-related content, achieving a BLEU-4 score of 29.80 on the drink600 dataset. Built on the T5 architecture, MIT license.
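A minimal generation sketch; both the model id and the "generate question:" task prefix are assumptions for this entry:

```python
# Sketch: German question generation with a fine-tuned T5 (placeholder model id
# and assumed task prefix).
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_id = "your-username/t5-german-question-generation"  # placeholder
tokenizer = T5Tokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

context = "generate question: Der Apfelsaft wird aus frisch gepressten Äpfeln hergestellt."
inputs = tokenizer(context, return_tensors="pt")
generated = model.generate(inputs["input_ids"], max_length=48, num_beams=4)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```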