Brief-details: State-of-the-art Vietnamese-to-English neural machine translation model developed by VinAI, built on the mBART architecture and released under the AGPL-3.0 license
Brief-details: Advanced vision-encoder-decoder model (202M params) for converting invoice/receipt images to structured JSON/XML without OCR, built on the Donut architecture.
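A minimal translation sketch with the Hugging Face transformers API. The hub id vinai/vinai-translate-vi2en-v2 and the mBART-style decoder-start token are assumptions based on VinAI's public release; verify against the actual model card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hub id assumed for illustration; check the hub for the exact checkpoint.
model_id = "vinai/vinai-translate-vi2en-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="vi_VN")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Xin chào thế giới", return_tensors="pt")
# mBART-style models start decoding from the target-language token.
output_ids = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.lang_code_to_id["en_XX"],
    num_beams=5,
    early_stopping=True,
)
english = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(english)
```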
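A hedged usage sketch for OCR-free document parsing with the transformers Donut classes. The naver-clova-ix/donut-base-finetuned-cord-v2 receipt checkpoint is an assumption (the exact checkpoint behind this entry is not stated):

```python
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

ckpt = "naver-clova-ix/donut-base-finetuned-cord-v2"  # assumed receipt-parsing checkpoint
processor = DonutProcessor.from_pretrained(ckpt)
model = VisionEncoderDecoderModel.from_pretrained(ckpt)

# Blank placeholder image; replace with a real receipt scan.
image = Image.new("RGB", (640, 960), "white")
pixel_values = processor(image, return_tensors="pt").pixel_values

# Donut is steered with a task prompt token instead of a separate OCR stage.
task_prompt = "<s_cord-v2>"
decoder_input_ids = processor.tokenizer(
    task_prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(pixel_values, decoder_input_ids=decoder_input_ids, max_length=512)
parsed = processor.token2json(processor.batch_decode(outputs)[0])
print(parsed)  # nested dict mirroring the document's fields
```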
Brief-details: A specialized 7B parameter LLM based on llama-2-7b-32k, fine-tuned for web search and information extraction with extended context window and no data logging
Brief-details: DISC-MedLLM is a Chinese medical domain LLM based on Baichuan-13B, trained on 470k+ medical examples, specializing in conversational healthcare and medical consultations.
Brief-details: YOLOv8-based object detection model for web form UI element detection. Achieves 0.52 mAP@0.95. Trained on 600 images to identify form fields such as input boxes and buttons.
Brief-details: Arabic text-to-speech model from Facebook's MMS project. 36.3M parameters, VITS architecture, supports end-to-end speech synthesis with stochastic duration prediction.
Brief-details: Add-Detail-XL is an experimental AI model by PvDeep; its architecture and license are undocumented. Has 21 community likes.
Brief-details: SALMONN is a groundbreaking LLM enabling speech, audio, and music understanding, developed by Tsinghua University and ByteDance, featuring multimodal audio perception capabilities.
Brief-details: WavLM-based emotion diarization model trained on 6 datasets, achieving a 29.7% Emotion Diarization Error Rate (EDER). Identifies emotion segments in speech with temporal boundaries.
Brief-details: TANGO - A state-of-the-art text-to-audio generation model using latent diffusion and Flan-T5 encoder, capable of creating realistic sounds from text prompts.
Brief-details: BLIP-2 image-to-text model pairing a frozen image encoder with the OPT-2.7b language model. Specializes in image captioning and visual QA. MIT licensed.
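A captioning sketch with the transformers BLIP-2 classes, assuming the Salesforce/blip2-opt-2.7b checkpoint (note the full model is several GB to download):

```python
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

ckpt = "Salesforce/blip2-opt-2.7b"
processor = Blip2Processor.from_pretrained(ckpt)
model = Blip2ForConditionalGeneration.from_pretrained(ckpt)

# Placeholder image; use a real photo for meaningful captions.
image = Image.new("RGB", (384, 384), "blue")
inputs = processor(images=image, return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=30)
caption = processor.batch_decode(out, skip_special_tokens=True)[0].strip()
print(caption)
```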
Brief-details: Technology-oriented 7B parameter LLM specialized in knowledge graph construction, relation extraction, and technical domain Q&A with Chinese-English capabilities
Brief-details: Collection of LoRA models focused on art styles, character expressions, and body features. Includes Indonesian language support & tarot card styles. Has 24 community likes and diverse implementations.
Brief-details: LINE's Japanese DistilBERT model trained on 131GB web text. 6-layer architecture with 68M params. Strong JGLUE benchmark performance. Apache 2.0 licensed.
Brief-details: Bulgarian word vectors trained by Facebook using fastText on Common Crawl & Wikipedia. Supports efficient word embeddings & text classification in Bulgarian.
Brief-details: NVIDIA FastPitch is a parallel transformer-based TTS model with 45M parameters, offering prosody control and English speech synthesis using LJSpeech dataset.
Brief-details: LSTM-based weather forecasting model that predicts temperature using 6 years of climate data from Jena, Germany. Processes 14 weather features for 12-hour predictions.
Brief-details: Large English language model from spaCy with 514K word vectors, 97.3% POS accuracy, and strong NER (85.4% F-score) capabilities for NLP tasks
Brief-details: Advanced SDXL v-prediction model trained on Danbooru/e621 datasets. Optimized for high-quality image generation with specific parameter requirements and extensive documentation.
Brief-details: BERT-based emotion detection model with 109M parameters, achieving 93.8% accuracy on emotion classification from text, supporting 6 emotion categories.
Brief-details: Russian T5-based abstractive summarization model with 244M parameters, fine-tuned on 4 datasets for generating concise Russian text summaries