Brief Details: RoBERTa-base model fine-tuned for Chinese sentiment analysis on Dianping restaurant reviews. Trained with the UER-py toolkit and optimized for text classification tasks.
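A minimal usage sketch, assuming the checkpoint follows UER's naming scheme as "uer/roberta-base-finetuned-dianping-chinese"; swap in the actual repo ID if it differs:

```python
# Hedged sketch: sentiment classification of a Chinese restaurant review.
# The repo ID below is an assumption based on UER-py's published model names.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="uer/roberta-base-finetuned-dianping-chinese",
)

print(classifier("这家餐厅的菜味道很好，服务也很周到。"))  # expect a positive label
```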
Brief Details: Chinese ALBERT Large model trained on CLUECorpusSmall dataset. Features 24 layers, 1024 hidden size. Specialized for Chinese masked language modeling and text representation tasks.
Brief Details: Chinese ALBERT base model trained on CLUECorpusSmall dataset. Optimized for masked language modeling with 12 layers, 768 hidden dimensions. Supports both PyTorch and TensorFlow.
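A minimal fill-mask sketch for the ALBERT checkpoints above, assuming the UER repo ID "uer/albert-base-chinese-cluecorpussmall" and the BertTokenizer pairing that UER's ALBERT model cards describe:

```python
# Hedged sketch: masked-token prediction with the Chinese ALBERT base checkpoint.
# Repo ID and tokenizer class are assumptions; adjust if the card differs.
from transformers import AlbertForMaskedLM, BertTokenizer, FillMaskPipeline

tokenizer = BertTokenizer.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
model = AlbertForMaskedLM.from_pretrained("uer/albert-base-chinese-cluecorpussmall")

unmasker = FillMaskPipeline(model=model, tokenizer=tokenizer)
print(unmasker("中国的首都是[MASK]京。"))  # top prediction should be "北"
```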
Brief Details: A bilingual ELECTRA model trained on Korean-English data (100GB+). Strong performance on both Korean & English NLP tasks with 133M parameters.
Brief Details: XLM-RoBERTa-based multilingual NER model fine-tuned on PAN-X dataset, achieving 84.9% F1 score. Optimized for token classification across languages.
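A hedged token-classification sketch; the repo ID below is a placeholder, since the entry does not name the exact fine-tuned checkpoint:

```python
# Hedged sketch: multilingual NER with a PAN-X fine-tuned XLM-R checkpoint.
# "your-org/xlm-roberta-base-finetuned-panx" is a placeholder, not the real ID.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="your-org/xlm-roberta-base-finetuned-panx",
    aggregation_strategy="simple",  # merge word pieces into whole entities
)

print(ner("Angela Merkel besuchte Paris im Juli."))
```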
Brief Details: LongLM-large is a 993M-parameter Chinese language model specialized in long-text generation and understanding, featuring text infilling and conditional continuation capabilities.
Brief Details: Lawformer is a pre-trained language model specialized for long Chinese legal documents, built on the Longformer architecture; the checkpoint has 19 likes and 729 downloads on Hugging Face.
Brief Details: HuBERT-Large model fine-tuned for emotion recognition, achieving 67.62% accuracy on IEMOCAP dataset. Handles 16kHz speech audio for 4-class emotion classification.
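A minimal sketch, assuming this corresponds to the SUPERB emotion-recognition checkpoint "superb/hubert-large-superb-er"; the second HuBERT emotion entry below can be run the same way:

```python
# Hedged sketch: 4-class emotion classification on a 16 kHz WAV file.
# The repo ID is an assumption; replace it with the actual checkpoint name.
from transformers import pipeline

emotion = pipeline("audio-classification", model="superb/hubert-large-superb-er")
print(emotion("speech_16khz.wav", top_k=4))  # scores for the four IEMOCAP emotion classes
```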
Brief Details: FastSpeech2-based Chinese text-to-speech model trained on the Baker dataset, offering high-quality end-to-end speech synthesis with adjustable speed and pitch control.
Brief Details: Estonian GPT-2-based language model trained on 2.2B words. Uses domain-specific prefixes for text generation and has 118.68M parameters.
Brief Details: HuBERT-based model for emotion recognition in speech, trained on IEMOCAP dataset. Achieves 63.59% accuracy for 4-class emotion classification.
Brief Details: SqueezeBERT is an efficient BERT variant that replaces fully-connected layers with grouped convolutions, achieving 4.3x faster inference on mobile devices.
Brief Details: German GPT-2 language model with 137M parameters, trained on 90GB of clean German web text (GC4 corpus). MIT-licensed, suitable for text generation and as a base for fine-tuning.
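A minimal generation sketch, assuming the GC4-trained checkpoint is published as "stefan-it/german-gpt2-larger"; substitute the actual repo ID if it differs:

```python
# Hedged sketch: open-ended German text generation with a GPT-2 checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="stefan-it/german-gpt2-larger")
print(generator("Heute ist ein schöner Tag und", max_new_tokens=40, do_sample=True))
```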
Brief Details: SqueezeBERT-MNLI: Efficient BERT variant using grouped convolutions, 4.3x faster on mobile devices, pretrained on BookCorpus/Wikipedia and fine-tuned on MNLI.
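A minimal natural language inference sketch, assuming the checkpoint is published as "squeezebert/squeezebert-mnli":

```python
# Hedged sketch: premise/hypothesis classification with the MNLI-finetuned model.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "squeezebert/squeezebert-mnli"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
label_id = logits.argmax(dim=-1).item()
print(model.config.id2label[label_id])  # expected: entailment
```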
Brief Details: SepFormer-WHAM is a state-of-the-art audio source separation model achieving 16.3 dB SI-SNRi on the WHAM! dataset, ideal for separating mixed speech signals in the presence of environmental noise.
Brief Details: SepFormer speech enhancement model trained on the WHAM! dataset. Achieves 14.35 dB SI-SNR and 3.07 PESQ. Specialized for 8 kHz audio denoising using a transformer architecture.
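A minimal sketch covering both SepFormer entries above via SpeechBrain's pretrained interface; the repo IDs "speechbrain/sepformer-wham" and "speechbrain/sepformer-wham-enhancement" are assumptions, and both models expect 8 kHz input:

```python
# Hedged sketch: source separation and speech enhancement with SepFormer.
import torchaudio
from speechbrain.pretrained import SepformerSeparation

# Two-speaker separation from a noisy mixture.
separator = SepformerSeparation.from_hparams(
    source="speechbrain/sepformer-wham", savedir="pretrained/sepformer-wham"
)
est_sources = separator.separate_file(path="mixture_8khz.wav")  # [batch, time, n_sources]
torchaudio.save("speaker1.wav", est_sources[:, :, 0].detach().cpu(), 8000)
torchaudio.save("speaker2.wav", est_sources[:, :, 1].detach().cpu(), 8000)

# Single-channel denoising with the enhancement checkpoint.
enhancer = SepformerSeparation.from_hparams(
    source="speechbrain/sepformer-wham-enhancement",
    savedir="pretrained/sepformer-wham-enhancement",
)
clean = enhancer.separate_file(path="noisy_8khz.wav")
torchaudio.save("enhanced.wav", clean[:, :, 0].detach().cpu(), 8000)
```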
Brief Details: Large Polish language model for spaCy with comprehensive NLP capabilities including NER (84.15% F-score), POS tagging (97.81% accuracy), and dependency parsing.
Brief Details: Large Russian language model for NLP tasks with high accuracy (95%+ on NER, POS, and morphology). Includes tok2vec, parser, NER & other components.
Brief Details: State-of-the-art English language transformer model for NLP tasks with 98.13% tagging accuracy and 90.19% NER F-score, built on the RoBERTa architecture.
Brief Details: Compact English language model for NLP tasks with 97.25% tagging accuracy. Features tok2vec, tagger, parser & NER components. MIT licensed.
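A minimal loading sketch for the four spaCy pipelines above, assuming they correspond to the standard packages pl_core_news_lg, ru_core_news_lg, en_core_web_trf, and en_core_web_sm; each package must be downloaded first, e.g. python -m spacy download en_core_web_sm:

```python
# Hedged sketch: tagging, parsing, and NER with a spaCy pipeline.
import spacy

nlp = spacy.load("en_core_web_sm")  # swap in pl_core_news_lg / ru_core_news_lg / en_core_web_trf
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

for token in doc:
    print(token.text, token.pos_, token.dep_)  # POS tags and dependency labels
for ent in doc.ents:
    print(ent.text, ent.label_)  # named entities
```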
Brief Details: BERT model optimized for Korean language processing, with a 16,424-token vocabulary and 99M parameters. Features BidirectionalWordPiece tokenization and character/sub-character support.
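A minimal feature-extraction sketch, assuming the repo ID "snunlp/KR-BERT-char16424"; note that the BidirectionalWordPiece tokenizer described above may require the authors' custom tokenizer code, while AutoTokenizer falls back to standard WordPiece:

```python
# Hedged sketch: encoding a Korean sentence with an assumed KR-BERT checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

name = "snunlp/KR-BERT-char16424"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

inputs = tokenizer("한국어 문장을 인코딩합니다.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # [1, seq_len, hidden_size]
print(hidden.shape)
```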