BRIEF-DETAILS: Jukebox-5b-lyrics is OpenAI's 5-billion-parameter music generation model, specialized in creating music with coherent lyrics.
Brief-details: Zero-shot image classification model based on CLIP, allowing flexible category matching without training. Ideal for quick image classification tasks with customizable labels.
Brief Details: BERT-based classifier fine-tuned for sequence classification tasks, achieving 99.94% accuracy with optimized hyperparameters and linear learning rate scheduling.
Brief Details: A BART-based paraphrasing model trained on the ParaBank 2 dataset, specializing in high-quality sentence-level paraphrasing with controllable diversity.
Brief-details: BLIP image captioning model customized for Hugging Face Inference Endpoints, supporting various decoding strategies like beam search, nucleus sampling, and contrastive search.
Brief Details: A Vision Transformer (ViT) based model specialized in dog breed classification, developed by skyau on Hugging Face for accurate canine breed identification.
Brief-details: A grapheme-to-phoneme conversion model by SpeechBrain that converts text to phonetic representations with semantic disambiguation, trained on LibriG2P data.
BRIEF-DETAILS: French GPT model specialized in text-to-image generation, trained on 10.8M text-image pairs and using a VQGAN architecture for high-quality image synthesis.
Brief Details: T5-large model fine-tuned on XSUM-CNN datasets for summarization, ranking 3rd on the XSUM leaderboard with strong ROUGE scores.
BRIEF-DETAILS: German DistilBERT model fine-tuned for text complexity prediction on a 1-7 scale, trained on the Naderi dataset and published at KONVENS 2022.
BRIEF-DETAILS: Multilingual STT model supporting Luxembourgish, German, French, English, and Portuguese, built with Coqui-STT v1.3.0. Features a custom dataset and specialized alphabet support.
Brief Details: A fine-tuned DETR-ResNet-50 model specialized in detecting illustrations in historical chapbooks, optimized for digital humanities workflows.
Brief-details: A text-to-speech VITS model trained on the Common Voice Georgian dataset by NeonGecko, offering neural voice synthesis for Georgian.
Brief-details: A streaming-capable ASR model trained on WenetSpeech, using pruned transducer architecture for efficient stateless inference in real-time applications.
BRIEF-DETAILS: XLM-RoBERTa model fine-tuned for Urdu sentiment analysis. The base model was pretrained on 2.5TB of data across 100 languages, achieving SOTA results in cross-lingual tasks.
Brief-details: A state-of-the-art diffusion model by Google for high-quality image generation, specialized in cat image synthesis at 256x256 resolution. The reported FID score of 3.17 is on CIFAR10.
Brief Details: DDPM model trained on church images (256x256), offering high-quality image synthesis using diffusion models with state-of-the-art FID scores. Supports multiple noise schedulers.
Brief-details: DDPM model specialized in bedroom image generation at 256x256 resolution, achieving quality comparable to ProgressiveGAN. Supports multiple noise schedulers for flexible inference.
Brief-details: DDPM model trained on church images (256x256), offering high-quality image synthesis using diffusion probabilistic models. Features multiple scheduler options for inference.
Brief details: A high-quality image generation model using the DDPM architecture, trained on the CelebA-HQ dataset for 256x256 face synthesis with multiple scheduling options.
Brief-details: T5-base reranker model fine-tuned on the MS MARCO passage dataset for 100k training steps and optimized for document ranking. Ideal for search and retrieval tasks.