Brief-details: BanglaT5 - A 247M parameter seq2seq transformer for Bengali NLP tasks, achieving SOTA results in translation, summarization, and QA tasks
BRIEF-DETAILS: ScholarBERT is a BERT-large variant with 340M parameters, trained on 221B tokens from scientific literature, optimized for academic text processing and analysis.
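A minimal fill-mask sketch for the kind of scientific-text use described above; the `globuslabs/ScholarBERT` repo id is an assumption to verify against the actual model card.

```python
from transformers import pipeline

# Assumed repo id for the hosted ScholarBERT checkpoint; adjust if the card differs.
fill = pipeline("fill-mask", model="globuslabs/ScholarBERT")

# BERT-style masked-token prediction on scientific text.
for pred in fill("The mitochondria is the [MASK] of the cell."):
    print(pred["token_str"], round(pred["score"], 3))
```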
Brief-details: African language-specialized XLM-R-large model, fine-tuned on 17 African languages, achieving 83.9% avg F-score on NER tasks. Optimized for cross-lingual transfer.
BRIEF-DETAILS: Multilingual text embedding model based on MiniLM architecture, optimized for cross-lingual information retrieval and semantic search tasks.
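A minimal semantic-search sketch of this usage pattern, assuming the widely used `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` checkpoint stands in for the model described.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed checkpoint: a common MiniLM-based multilingual embedder; the model
# described above may ship under a different repo id.
model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

query = "How do I reset my password?"
docs = [
    "Passwort zurücksetzen",          # German
    "Cambiar la foto de perfil",      # Spanish
    "Réinitialiser le mot de passe",  # French
]

# Embed query and documents, then rank documents by cosine similarity (cross-lingual retrieval).
scores = util.cos_sim(model.encode(query), model.encode(docs))[0]
for doc, score in sorted(zip(docs, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.3f}  {doc}")
```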
BRIEF-DETAILS: OPT-350M model fine-tuned for email generation, trained on the AESLC dataset. 350M parameters; supports efficient email completion with a 64-token generation limit.
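A minimal generation sketch for the 64-token completion behavior; `your-org/opt-350m-emailgen` is a placeholder repo id, not the actual checkpoint name.

```python
from transformers import pipeline

# Placeholder repo id; substitute the actual fine-tuned OPT-350M email checkpoint.
generator = pipeline("text-generation", model="your-org/opt-350m-emailgen")

prompt = "Subject: Quarterly report\n\nHi team,\n\n"
# The 64-token limit noted above maps to max_new_tokens=64.
result = generator(prompt, max_new_tokens=64, do_sample=True)
print(result[0]["generated_text"])
```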
Brief Details: EmTract is a DistilBERT-based emotion detection model specialized for financial social media, trained on 250K texts across 7 emotions with additional StockTwits data.
BRIEF DETAILS: Korean-specific T5 model (250M params) trained on Korean wiki data using BBPE tokenization. Achieves strong performance on KLUE tasks after fine-tuning.
BRIEF DETAILS: BETO-based Spanish sentiment analysis model fine-tuned on 50K movie reviews, achieving 91% accuracy for binary classification tasks.
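A minimal classification sketch of how such a model is typically called; `your-org/beto-sentiment-movies` is a placeholder id for the checkpoint described above.

```python
from transformers import pipeline

# Placeholder repo id; substitute the actual BETO sentiment checkpoint.
classifier = pipeline("text-classification", model="your-org/beto-sentiment-movies")

reviews = [
    "La película fue una obra maestra, la disfruté de principio a fin.",
    "Una trama aburrida y actuaciones muy flojas.",
]
# Binary positive/negative prediction per review.
for review, pred in zip(reviews, classifier(reviews)):
    print(pred["label"], f"{pred['score']:.2f}", "-", review)
```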
Brief-details: Finnish T5-based model for text correction, trained on 300k samples from Finnish news and Wikipedia. Achieves 1.1% median CER.
Brief Details: RRG_scorers is a set of scoring models developed by StanfordAIMI for evaluating radiology report generation (RRG) outputs, available through Hugging Face.
Brief Details: CycleGAN model designed to transform images between GTA-style graphics and real-world photos, developed by Jorgvt and hosted on Hugging Face.
Brief-details: VITS-based text-to-speech model trained on LJSpeech dataset for English language synthesis, developed by neongeckocom for high-quality voice generation.
Brief-details: ChemGPT-1.2B is a GPT-Neo-based molecular generation transformer trained on PubChem10M for generative chemistry tasks, operating on SELFIES molecular string representations.
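A minimal generation sketch under the assumption that the checkpoint is the commonly cited `ncfrey/ChemGPT-1.2B` and that prompts are SELFIES fragments; check the model card before relying on either.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed repo id and prompt format (SELFIES tokens); verify against the card.
model_id = "ncfrey/ChemGPT-1.2B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Seed with a short SELFIES fragment and let the model extend the molecule.
inputs = tok("[C][C][O]", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32, do_sample=True)
print(tok.decode(out[0]))
```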
Brief Details: Qwen2.5-Coder-1.5B-Instruct-AWQ is a 4-bit AWQ-quantized, code-specialized instruct LLM with 1.54B parameters and a 32K-token context window.
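A minimal chat-template sketch following the standard Qwen2.5 usage pattern; AWQ weights generally require a CUDA GPU and the autoawq package.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```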
BRIEF-DETAILS: Longformer-large-4096 is an efficient transformer by Allen AI that handles documents up to 4,096 tokens by combining sliding-window local attention with task-specific global attention.
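A minimal encoding sketch showing the global-attention flag that Longformer adds on top of its sliding-window attention.

```python
import torch
from transformers import LongformerTokenizer, LongformerModel

tok = LongformerTokenizer.from_pretrained("allenai/longformer-large-4096")
model = LongformerModel.from_pretrained("allenai/longformer-large-4096")

text = "A very long document. " * 300  # anything up to 4096 tokens
inputs = tok(text, return_tensors="pt", truncation=True, max_length=4096)

# Local sliding-window attention everywhere, plus global attention on the
# <s> token (position 0) so it can attend to the full sequence.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

outputs = model(**inputs, global_attention_mask=global_attention_mask)
print(outputs.last_hidden_state.shape)
```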
Brief-details: A Hugging Face transformers model by MrezaPRZ, likely based on Qwen architecture. Limited documentation available but appears focused on picking/selection tasks.
Brief Details: MERT-v0-public is a 95M parameter music understanding model trained on open-source audio data using MLM paradigm, featuring 12 transformer layers and 768-dimensional outputs.
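A minimal feature-extraction sketch following the usage pattern on the MERT model cards; the repo id and sampling-rate handling are assumptions to check against the card.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, AutoModel

# Assumed repo id; MERT checkpoints ship custom code, hence trust_remote_code=True.
model_id = "m-a-p/MERT-v0-public"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
processor = Wav2Vec2FeatureExtractor.from_pretrained(model_id, trust_remote_code=True)

# Five seconds of silence at the processor's expected sampling rate, as a stand-in for real audio.
sr = processor.sampling_rate
audio = torch.zeros(sr * 5)

inputs = processor(audio.numpy(), sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# 13 hidden states (embedding + 12 transformer layers), each 768-dimensional per frame.
print(torch.stack(out.hidden_states).shape)
```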
Brief-details: A compact BERT variant with 6 layers and 768 hidden dimensions, based on Google's research into efficient transformer architectures.
Brief Details: Llama-2-70b-hf is the largest model in Meta's Llama 2 family, with 70B parameters, offering state-of-the-art performance on a range of NLP tasks via Hugging Face.
Brief-details: Advanced 8B multimodal LLM with optimized vision-language capabilities, featuring Mixed Preference Optimization and superior performance across visual reasoning tasks
Brief-details: 103B parameter AWQ-quantized LLM optimized for roleplay and storytelling. Features 120 layers, uncensored output, and strong performance at an 8192-token context window.