BRIEF-DETAILS: Large variant of TurkuNLP's Finnish BERT, extending the base model with greater capacity. Specialized for Finnish language processing tasks.
BRIEF DETAILS: Sentence embedding model that maps text to 768-dimensional vectors, optimized for semantic search and clustering on the MS MARCO dataset. Based on the BERT architecture.
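A minimal usage sketch with the sentence-transformers library; the repo id below is a placeholder for the actual checkpoint named in the card.

```python
# Encode a query and candidate passages, then rank by similarity (semantic search).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/msmarco-bert-base-dot-v5")  # assumed id
query = "how do solar panels work"
passages = [
    "Solar panels convert sunlight into electricity using photovoltaic cells.",
    "The Eiffel Tower was completed in 1889.",
]

# Each text is mapped to a 768-dimensional vector.
query_emb = model.encode(query, convert_to_tensor=True)
passage_embs = model.encode(passages, convert_to_tensor=True)

scores = util.dot_score(query_emb, passage_embs)[0]
print(scores)  # higher score = more relevant passage
```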
Brief-details: AI model for categorizing US Congressional bills (1950-2015) using the Comparative Agendas Project taxonomy. 92.68% training accuracy, 91.61% validation accuracy.
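A hedged sketch of classifying a bill summary via the transformers text-classification pipeline; the repo id and the example label are hypothetical and should be replaced with the real ones.

```python
from transformers import pipeline

# Hypothetical repo id; substitute the actual model name from the card.
classifier = pipeline("text-classification", model="example-org/congressional-bills-cap")

bill = "A bill to amend the Clean Air Act to establish new emission standards."
print(classifier(bill))  # e.g. [{'label': 'Environment', 'score': 0.97}]
```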
Brief-details: Specialized ASR model for Darija (Moroccan Arabic) using wav2vec 2.0, achieving 18.28% WER on test data. Features CTC/Attention architecture and unigram tokenization.
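The CTC/attention recipe suggests a SpeechBrain-style checkpoint; the sketch below assumes that packaging and uses a placeholder repo id.

```python
# Assumption: the model ships SpeechBrain hyperparameters for EncoderDecoderASR.
from speechbrain.pretrained import EncoderDecoderASR

asr = EncoderDecoderASR.from_hparams(
    source="example-org/asr-wav2vec2-ctc-attention-darija",  # hypothetical id
    savedir="pretrained_models/darija-asr",
)
print(asr.transcribe_file("sample_darija.wav"))  # prints the recognized Darija transcript
```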
Brief Details: GPTuz is a state-of-the-art Uzbek language model based on GPT-2, fine-tuned on 0.53GB of Kun.uz data using an NVIDIA V100 GPU, created by rifkat.
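Since this is a GPT-2-style causal model, a standard text-generation pipeline should apply; the repo id "rifkat/GPTuz" is inferred from the description and may differ.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="rifkat/GPTuz")  # assumed id
print(generator("O'zbekiston poytaxti", max_new_tokens=40)[0]["generated_text"])
```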
BRIEF DETAILS: BERT model specialized for ancient Chinese texts with expanded vocabulary (38,208 tokens) and domain-adaptive pretraining. Achieves superior performance in word segmentation and POS tagging for classical Chinese texts.
Brief Details: HATE-ITA is an Italian hate speech detection model with a 0.83 F1 score, based on the XLM-T architecture. Supports multi-language classification.
Brief-details: Chinese version of mT5-small (77M params) optimized for NLT (natural language transformation) tasks. Uses corpus-adaptive pretraining (CAPT) on the WuDao Corpora, trained with a span-corruption objective on 8 A100 GPUs.
Brief-details: ModernBERT-large-zeroshot v2.0 - Fast and memory-efficient BERT variant with 85% accuracy across evaluated tasks. Features an 8k context window and is optimized for zero-shot classification.
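A zero-shot classification sketch with the transformers pipeline; the repo id is assumed from the entry name (e.g. MoritzLaurer/ModernBERT-large-zeroshot-v2.0) and should be verified.

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/ModernBERT-large-zeroshot-v2.0",  # assumed id
)

text = "The new GPU delivers twice the throughput at the same power draw."
labels = ["hardware", "sports", "politics"]
print(classifier(text, candidate_labels=labels))  # labels ranked by score
```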
Brief-details: A 7B-parameter chat model from DeepSeek, available in multiple GGUF quantizations (2- to 8-bit). Optimized for both English and Chinese, trained on 2T tokens.
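One way to run a GGUF quant locally is llama-cpp-python; the file name below is illustrative, so pick whichever quantization level fits your hardware.

```python
from llama_cpp import Llama

# Illustrative file name; use the downloaded quant you chose (e.g. Q4_K_M).
llm = Llama(model_path="deepseek-llm-7b-chat.Q4_K_M.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "用中文介绍一下你自己"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```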
Brief-details: Uncensored version of Qwen2.5-VL-7B-Instruct produced via abliteration, specialized for vision-language tasks with content restrictions removed.
Brief Details: A 7B parameter bilingual LLM based on Mistral, specializing in Hebrew and English with a 200K context length and a 64K-token Hebrew-enhanced tokenizer.
BRIEF-DETAILS: Falcon3-3B-Instruct: A 3B parameter multilingual instruction-tuned LLM supporting 4 languages with a 32K context window, optimized for STEM and reasoning tasks.
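A chat-template sketch with transformers; the repo id "tiiuae/Falcon3-3B-Instruct" is implied by the entry name and should be confirmed before use.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tiiuae/Falcon3-3B-Instruct"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain Ohm's law in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```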
Brief-details: OREAL-32B-GGUF is a quantized version of the OREAL-32B model, offering various compression levels from 12.4GB to 34.9GB with different quality-performance tradeoffs.
BRIEF-DETAILS: CLIP vision model (ViT-B/32) exported to ONNX for efficient inference, specialized in image embeddings and similarity search.
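A hedged sketch of running the exported vision encoder with onnxruntime; the file name is illustrative and the preprocessing follows the standard CLIP ViT-B/32 recipe, which is an assumption about this particular export.

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession("clip_vit_b32_vision.onnx")  # illustrative file name

# Standard CLIP preprocessing: 224x224 RGB, scaled to [0,1], channel-wise normalized.
image = Image.open("cat.jpg").convert("RGB").resize((224, 224))
pixels = np.asarray(image, dtype=np.float32) / 255.0
mean = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
std = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)
pixels = ((pixels - mean) / std).transpose(2, 0, 1)[None]  # NCHW batch of one

input_name = session.get_inputs()[0].name
embedding = session.run(None, {input_name: pixels})[0]
print(embedding.shape)  # image embedding usable for similarity search
```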
Brief-details: ResNet50 CLIP model trained on the CC12M dataset, combining CLIP's vision-language training with a ResNet50 image encoder, compatible with both the OpenCLIP and timm frameworks.
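An OpenCLIP usage sketch; loading via the ('RN50', 'cc12m') pretrained tag is an assumption based on the advertised OpenCLIP compatibility and may differ from the exact checkpoint referenced here.

```python
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms("RN50", pretrained="cc12m")
tokenizer = open_clip.get_tokenizer("RN50")

image = preprocess(Image.open("cat.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize before comparing embeddings.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    similarity = image_features @ text_features.T
print(similarity)  # cosine similarity of the image against each caption
```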
Brief-details: Croatian ASR model fine-tuned on ParlaSpeech-HR dataset, achieving 0.0234 CER and 0.0761 WER on test data. Based on wav2vec2-xls-r-300m.
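A transcription sketch for the wav2vec2-xls-r fine-tune using transformers; the repo id below is assumed (the ParlaSpeech-HR fine-tune is published under the classla organization) and the model expects 16 kHz audio.

```python
import torch
import librosa
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

model_id = "classla/wav2vec2-xls-r-parlaspeech-hr"  # assumed id
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

speech, _ = librosa.load("croatian_sample.wav", sr=16_000)  # resample to 16 kHz
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(ids)[0])  # greedy CTC decode of the transcript
```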
BRIEF-DETAILS: numnim3_beta is an experimental AI model by krittapol hosted on Hugging Face Hub, with limited public documentation available about its specific capabilities and architecture.
Brief Details: A specialized language model fine-tuned for therapeutic conversations and mental health support, developed by Issactoto. Hosted on the Hugging Face Hub for accessible deployment.
BRIEF DETAILS: Japanese GPT-NeoX model (3.6B parameters) fine-tuned for instruction following. Features specialized tokenization and a conversation-style prompt format. MIT licensed.
Brief Details: A GGUF-formatted quantized version of DeepSeek's 32B model, optimized for llama.cpp, offering efficient local deployment of large language models.