Brief-details: BanglaT5 - A 247M parameter seq2seq transformer for Bengali NLP tasks, achieving SOTA results in translation, summarization, and QA tasks
BRIEF-DETAILS: ScholarBERT is a BERT-large variant with 340M parameters, trained on 221B tokens from scientific literature, optimized for academic text processing and analysis.
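A minimal fill-mask sketch for the kind of scientific-text use described above; the `globuslabs/ScholarBERT` repo id is an assumption to verify against the actual model card.

```python
from transformers import pipeline

# Assumed repo id for the hosted ScholarBERT checkpoint; adjust if the card differs.
fill = pipeline("fill-mask", model="globuslabs/ScholarBERT")

# BERT-style masked-token prediction on scientific text.
for pred in fill("The mitochondria is the [MASK] of the cell."):
    print(pred["token_str"], round(pred["score"], 3))
```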
Brief-details: African language-specialized XLM-R-large model, fine-tuned on 17 African languages, achieving 83.9% avg F-score on NER tasks. Optimized for cross-lingual transfer.
BRIEF-DETAILS: Multilingual text embedding model based on MiniLM architecture, optimized for cross-lingual information retrieval and semantic search tasks.
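A minimal semantic-search sketch of this usage pattern, assuming the widely used `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` checkpoint stands in for the model described.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed checkpoint: a common MiniLM-based multilingual embedder; the model
# described above may ship under a different repo id.
model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

query = "How do I reset my password?"
docs = [
    "Passwort zurücksetzen",          # German
    "Cambiar la foto de perfil",      # Spanish
    "Réinitialiser le mot de passe",  # French
]

# Embed query and documents, then rank documents by cosine similarity (cross-lingual retrieval).
scores = util.cos_sim(model.encode(query), model.encode(docs))[0]
for doc, score in sorted(zip(docs, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.3f}  {doc}")
```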
BRIEF-DETAILS: OPT-350M model fine-tuned for email generation, trained on the AESLC dataset. 350M parameters; supports efficient email completion with a 64-token generation limit.
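A minimal generation sketch for the 64-token completion behavior; `your-org/opt-350m-emailgen` is a placeholder repo id, not the actual checkpoint name.

```python
from transformers import pipeline

# Placeholder repo id; substitute the actual fine-tuned OPT-350M email checkpoint.
generator = pipeline("text-generation", model="your-org/opt-350m-emailgen")

prompt = "Subject: Quarterly report\n\nHi team,\n\n"
# The 64-token limit noted above maps to max_new_tokens=64.
result = generator(prompt, max_new_tokens=64, do_sample=True)
print(result[0]["generated_text"])
```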
Brief Details: EmTract is a DistilBERT-based emotion detection model specialized for financial social media, trained on 250K texts across 7 emotions with additional StockTwits data.
BRIEF DETAILS: Korean-specific T5 model (250M params) trained on Korean wiki data using BBPE tokenization. Achieves strong performance on KLUE tasks after fine-tuning.
BRIEF DETAILS: BETO-based Spanish sentiment analysis model fine-tuned on 50K movie reviews, achieving 91% accuracy for binary classification tasks.
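A minimal classification sketch of how such a model is typically called; `your-org/beto-sentiment-movies` is a placeholder id for the checkpoint described above.

```python
from transformers import pipeline

# Placeholder repo id; substitute the actual BETO sentiment checkpoint.
classifier = pipeline("text-classification", model="your-org/beto-sentiment-movies")

reviews = [
    "La película fue una obra maestra, la disfruté de principio a fin.",
    "Una trama aburrida y actuaciones muy flojas.",
]
# Binary positive/negative prediction per review.
for review, pred in zip(reviews, classifier(reviews)):
    print(pred["label"], f"{pred['score']:.2f}", "-", review)
```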
Brief-details: Finnish T5-based model for text correction, trained on 300k samples from Finnish news and Wikipedia. Achieves 1.1% median CER.
Brief Details: RRG_scorers is a set of scoring models developed by StanfordAIMI for evaluating radiology report generation (RRG) outputs, available through Hugging Face.
Brief Details: CycleGAN model designed to transform images between GTA-style graphics and real-world photos, developed by Jorgvt and hosted on Hugging Face.
Brief-details: VITS-based text-to-speech model trained on LJSpeech dataset for English language synthesis, developed by neongeckocom for high-quality voice generation.
Brief-details: ChemGPT-1.2B is a GPT-Neo-based molecular generation transformer trained on PubChem10M for generative chemistry tasks, operating on SELFIES molecular string representations.
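A minimal generation sketch under the assumption that the checkpoint is the commonly cited `ncfrey/ChemGPT-1.2B` and that prompts are SELFIES fragments; check the model card before relying on either.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed repo id and prompt format (SELFIES tokens); verify against the card.
model_id = "ncfrey/ChemGPT-1.2B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Seed with a short SELFIES fragment and let the model extend the molecule.
inputs = tok("[C][C][O]", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32, do_sample=True)
print(tok.decode(out[0]))
```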
Brief Details: Qwen2.5-Coder-1.5B-Instruct-AWQ is a 4-bit AWQ-quantized, code-specialized instruct LLM with 1.54B parameters and a 32K-token context window.
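A minimal chat-template sketch following the standard Qwen2.5 usage pattern; AWQ weights generally require a CUDA GPU and the autoawq package.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```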
BRIEF-DETAILS: Longformer-large-4096 is an efficient transformer by Allen AI that handles documents up to 4,096 tokens by combining sliding-window local attention with task-specific global attention.
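A minimal encoding sketch showing the global-attention flag that Longformer adds on top of its sliding-window attention.

```python
import torch
from transformers import LongformerTokenizer, LongformerModel

tok = LongformerTokenizer.from_pretrained("allenai/longformer-large-4096")
model = LongformerModel.from_pretrained("allenai/longformer-large-4096")

text = "A very long document. " * 300  # anything up to 4096 tokens
inputs = tok(text, return_tensors="pt", truncation=True, max_length=4096)

# Local sliding-window attention everywhere, plus global attention on the
# <s> token (position 0) so it can attend to the full sequence.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

outputs = model(**inputs, global_attention_mask=global_attention_mask)
print(outputs.last_hidden_state.shape)
```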
Brief-details: A Hugging Face transformers model by MrezaPRZ, likely based on Qwen architecture. Limited documentation available but appears focused on picking/selection tasks.
Brief Details: MERT-v0-public is a 95M parameter music understanding model trained on open-source audio data using MLM paradigm, featuring 12 transformer layers and 768-dimensional outputs.
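A minimal feature-extraction sketch following the usage pattern on the MERT model cards; the repo id and sampling-rate handling are assumptions to check against the card.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, AutoModel

# Assumed repo id; MERT checkpoints ship custom code, hence trust_remote_code=True.
model_id = "m-a-p/MERT-v0-public"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
processor = Wav2Vec2FeatureExtractor.from_pretrained(model_id, trust_remote_code=True)

# Five seconds of silence at the processor's expected sampling rate, as a stand-in for real audio.
sr = processor.sampling_rate
audio = torch.zeros(sr * 5)

inputs = processor(audio.numpy(), sampling_rate=sr, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# 13 hidden states (embedding + 12 transformer layers), each 768-dimensional per frame.
print(torch.stack(out.hidden_states).shape)
```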
Brief-details: A compact BERT variant with 6 layers and 768 hidden dimensions, based on Google's research into efficient transformer architectures.
Brief Details: Llama-2-70b-hf is the largest model in Meta's Llama 2 family, with 70B parameters, offering state-of-the-art performance on a range of NLP tasks via Hugging Face.
Brief-details: Advanced 8B multimodal LLM with optimized vision-language capabilities, featuring Mixed Preference Optimization and superior performance across visual reasoning tasks
Brief-details: 103B parameter AWQ-quantized LLM optimized for roleplay and storytelling. Features 120 layers, uncensored output, and strong performance at an 8192-token context window.