Brief-details: Hermes-3 is a 405B parameter LLM built on Llama-3.1, offering advanced capabilities in reasoning, roleplaying, and function calling with ChatML format support.
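A minimal usage sketch via transformers, assuming the NousResearch/Hermes-3-Llama-3.1-405B repo id; the messages are illustrative, and a 405B checkpoint needs multi-GPU or quantized serving in practice.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed repo id for illustration; 405B weights require multi-GPU or quantized serving.
model_id = "NousResearch/Hermes-3-Llama-3.1-405B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# ChatML-style turns; apply_chat_template renders them with <|im_start|>/<|im_end|> markers.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain function calling in one sentence."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```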
Brief-details: A tiny variant of Swin Transformer V2 optimized for 256x256 images, featuring hierarchical feature extraction and efficient shifted-window self-attention for computer vision tasks.
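A short classification sketch with transformers; the microsoft/swinv2-tiny-patch4-window8-256 repo id and the sample image URL are assumptions for illustration.

```python
import torch
import requests
from PIL import Image
from transformers import AutoImageProcessor, Swinv2ForImageClassification

# Assumed repo id; any SwinV2-tiny 256px checkpoint with a classification head loads the same way.
model_id = "microsoft/swinv2-tiny-patch4-window8-256"
processor = AutoImageProcessor.from_pretrained(model_id)
model = Swinv2ForImageClassification.from_pretrained(model_id)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # sample image
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")  # resizes and normalizes to 256x256
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```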
Brief Details: A lightweight ALBERT-based sentence embedding model with 11.7M parameters, optimized for semantic similarity and paraphrase detection. Maps text to 768-dimensional vectors.
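A minimal sentence-transformers sketch, assuming the sentence-transformers/paraphrase-albert-small-v2 repo id for the model described above.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed repo id for the ALBERT-small paraphrase model.
model = SentenceTransformer("sentence-transformers/paraphrase-albert-small-v2")

sentences = ["A man is eating food.", "Someone is having a meal."]
embeddings = model.encode(sentences, convert_to_tensor=True)  # shape (2, 768)
print(util.cos_sim(embeddings[0], embeddings[1]).item())      # high score for paraphrases
```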
Brief-details: ControlNet model trained on canny edge detection, enabling precise control over Stable Diffusion image generation through edge maps. Trained on 3M edge-image-caption pairs over 600 GPU-hours on an A100.
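A hedged diffusers sketch: the lllyasviel/sd-controlnet-canny repo id, the SD 1.5 base checkpoint it is paired with, and the file paths are assumptions for illustration.

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Assumed repo ids: the canny ControlNet plus a standard SD 1.5 base checkpoint.
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# Build the conditioning image: Canny edges of an input photo (path is a placeholder).
source = np.array(Image.open("input.png").convert("RGB"))
edges = cv2.Canny(source, 100, 200)
canny_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

result = pipe("a futuristic city at dusk", image=canny_image, num_inference_steps=30).images[0]
result.save("controlled.png")
```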
Brief-details: A CTranslate2-optimized version of OpenAI's Whisper-small for efficient speech recognition, supporting 99 languages with MIT license and float16 precision.
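A minimal faster-whisper sketch; the audio path is a placeholder and a GPU with float16 support is assumed.

```python
from faster_whisper import WhisperModel

# CTranslate2 backend; "small" resolves to the converted Whisper-small weights.
model = WhisperModel("small", device="cuda", compute_type="float16")

# audio.wav is a placeholder path; the language is auto-detected when not specified.
segments, info = model.transcribe("audio.wav", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```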
Brief-details: A lightweight Phi-3 variant with 2.07M parameters, featuring 2 hidden layers and 4 attention heads, designed for experimental text generation tasks.
Brief-details: A powerful 14B parameter multilingual LLM excelling in English, Chinese, Japanese and Korean. Features strong reasoning, long context support (320k tokens) and multiple specialized variants.
Brief Details: MedCPT-Query-Encoder is a 109M parameter biomedical text embedding model trained on 255M PubMed query-article pairs for semantic search and retrieval.
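A short retrieval-embedding sketch with transformers, assuming the ncbi/MedCPT-Query-Encoder repo id; the queries are illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed repo id for the query-side encoder of MedCPT.
model_id = "ncbi/MedCPT-Query-Encoder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

queries = ["diabetes treatment", "covid vaccine efficacy in older adults"]
inputs = tokenizer(queries, truncation=True, padding=True, max_length=64, return_tensors="pt")
with torch.no_grad():
    # The [CLS] vector serves as the query embedding for dense retrieval.
    embeddings = model(**inputs).last_hidden_state[:, 0, :]
print(embeddings.shape)  # (2, 768)
```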
Brief-details: A Turkish BERT-based sentence transformer model that maps text to 768-dimensional vectors, trained on NLI and STS-B datasets with strong semantic similarity performance (0.83+ correlation scores).
Brief-details: SDXL-based text-to-image diffusion model with a high download count (326k+), optimized for artistic generation and run through the standard Stable Diffusion XL pipeline.
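The specific checkpoint is not named above, so this sketch loads the reference SDXL base weights (stabilityai/stable-diffusion-xl-base-1.0) as a stand-in; any SDXL-architecture repo loads the same way.

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Stand-in repo id: swap in the SDXL checkpoint you actually want.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

image = pipe(
    "an impressionist oil painting of a harbor at sunrise",
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
image.save("sdxl_art.png")
```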
Brief Details: YOLOS-tiny: Lightweight Vision Transformer (6.49M params) for object detection, achieving 28.7 AP on COCO. Apache 2.0 licensed, ideal for efficient deployment.
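A detection sketch with transformers, assuming the hustvl/yolos-tiny repo id; the image path is a placeholder.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, YolosForObjectDetection

# Assumed repo id for the tiny YOLOS checkpoint.
model_id = "hustvl/yolos-tiny"
processor = AutoImageProcessor.from_pretrained(model_id)
model = YolosForObjectDetection.from_pretrained(model_id)

image = Image.open("street.jpg").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw predictions into COCO-style boxes above a confidence threshold.
target_sizes = torch.tensor([image.size[::-1]])
results = processor.post_process_object_detection(outputs, threshold=0.9, target_sizes=target_sizes)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 3), [round(v, 1) for v in box.tolist()])
```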
Brief-details: A powerful BERT-based QA model with 335M parameters, achieving 80.88% exact match on SQuAD 2.0. Specializes in extractive question answering.
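A question-answering pipeline sketch; the deepset/bert-large-uncased-whole-word-masking-squad2 repo id and the toy context are assumptions.

```python
from transformers import pipeline

# Assumed repo id; any SQuAD 2.0-style extractive QA checkpoint plugs into the same pipeline.
qa = pipeline("question-answering", model="deepset/bert-large-uncased-whole-word-masking-squad2")

result = qa(
    question="How many parameters does the model have?",
    context="The extractive QA model is based on BERT-large and has 335 million parameters.",
)
print(result["answer"], result["score"])  # answer span extracted from the context, with confidence
```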
Brief Details: BERT base Japanese model trained on CC-100 and Wikipedia, featuring word-level tokenization with the Unidic 2.1.2 dictionary (followed by WordPiece subword splitting) and whole-word masking during pretraining.
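A fill-mask sketch, assuming the tohoku-nlp/bert-base-japanese-v3 repo id; the tokenizer needs the fugashi and unidic-lite packages for MeCab/Unidic segmentation.

```python
from transformers import pipeline

# Assumed repo id; install fugashi and unidic-lite for the word-level tokenizer.
fill_mask = pipeline("fill-mask", model="tohoku-nlp/bert-base-japanese-v3")

# Predict the masked word in "Tokyo is the [MASK] of Japan."
for candidate in fill_mask("東京は日本の[MASK]です。")[:3]:
    print(candidate["token_str"], round(candidate["score"], 3))
```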
Brief Details: A fine-tuned XLSR-53 large model for Polish speech recognition, achieving 14.21% WER on Common Voice, with 339K+ downloads and Apache 2.0 license.
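An ASR pipeline sketch, assuming the jonatasgrosman/wav2vec2-large-xlsr-53-polish repo id; the audio path is a placeholder.

```python
from transformers import pipeline

# Assumed repo id for the Polish XLSR-53 fine-tune.
asr = pipeline("automatic-speech-recognition", model="jonatasgrosman/wav2vec2-large-xlsr-53-polish")

# Placeholder path; the pipeline decodes the file to the model's expected 16 kHz rate.
print(asr("polish_sample.wav")["text"])
```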
Brief-details: Mask2Former model (216M params) for semantic segmentation with a Swin backbone, fine-tuned on the Cityscapes dataset and built on the masked-attention Transformer architecture.
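A semantic-segmentation sketch with transformers, assuming the facebook/mask2former-swin-large-cityscapes-semantic repo id; the image path is a placeholder.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

# Assumed repo id for the Swin-large Cityscapes semantic checkpoint.
model_id = "facebook/mask2former-swin-large-cityscapes-semantic"
processor = AutoImageProcessor.from_pretrained(model_id)
model = Mask2FormerForUniversalSegmentation.from_pretrained(model_id)

image = Image.open("street_scene.png").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Merge per-query mask and class predictions into one label map at the original resolution.
segmentation = processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
print(segmentation.shape)  # (H, W) tensor of Cityscapes class ids
```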
Brief Details: UniXcoder-base is a unified cross-modal pre-trained model for code representation, built on RoBERTa and trained on multimodal data (source code, code comments, and ASTs).
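A code-embedding sketch using plain transformers, assuming the microsoft/unixcoder-base repo id; mean pooling over token states is an illustrative choice (the official UniXcoder examples ship their own thin wrapper class).

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed repo id; the checkpoint loads as a RoBERTa encoder.
model_id = "microsoft/unixcoder-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

snippet = "def add(a, b):\n    return a + b"
inputs = tokenizer(snippet, return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
embedding = hidden.mean(dim=1)  # simple mean-pooled code representation, (1, 768)
print(embedding.shape)
```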
Brief Details: ESM-2 protein language model with 650M parameters. Trained on masked language modeling for protein sequences. Mid-tier model balancing performance and efficiency.
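A masked-residue prediction sketch, assuming the facebook/esm2_t33_650M_UR50D repo id; the protein sequence is a toy example.

```python
import torch
from transformers import AutoTokenizer, EsmForMaskedLM

# Assumed repo id for the 650M ESM-2 checkpoint.
model_id = "facebook/esm2_t33_650M_UR50D"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = EsmForMaskedLM.from_pretrained(model_id)

sequence = "MKTAYIAKQR<mask>ISFVKSHFSRQLEERLGLIEVQ"  # toy sequence with one masked residue
inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Rank amino acids for the masked position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top = logits[0, mask_pos].softmax(dim=-1).topk(3)
for score, token_id in zip(top.values[0], top.indices[0]):
    print(tokenizer.convert_ids_to_tokens(token_id.item()), round(score.item(), 3))
```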
Brief-details: ViTMatte is a Vision Transformer-based image matting model with 25.8M parameters, implemented in PyTorch and producing high-quality alpha mattes for foreground estimation.
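An image-matting sketch with transformers, assuming the hustvl/vitmatte-small-composition-1k repo id; ViTMatte needs both an RGB image and a trimap, and the file paths are placeholders.

```python
import torch
from PIL import Image
from transformers import VitMatteImageProcessor, VitMatteForImageMatting

# Assumed repo id; image and trimap paths are placeholders.
model_id = "hustvl/vitmatte-small-composition-1k"
processor = VitMatteImageProcessor.from_pretrained(model_id)
model = VitMatteForImageMatting.from_pretrained(model_id)

image = Image.open("photo.png").convert("RGB")
trimap = Image.open("trimap.png").convert("L")  # gray marks the unknown region to resolve

inputs = processor(images=image, trimaps=trimap, return_tensors="pt")
with torch.no_grad():
    alphas = model(**inputs).alphas  # (1, 1, H, W) alpha matte in [0, 1]
print(alphas.shape)
```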
Brief-details: A powerful English NER model built with Flair, achieving 93.06% F1-score on CoNLL-03. Identifies PER, LOC, ORG, and MISC entities using LSTM-CRF architecture.
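A Flair tagging sketch, assuming the flair/ner-english model id for the 4-class tagger described above.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Assumed model id for the standard 4-class (PER/LOC/ORG/MISC) English tagger.
tagger = SequenceTagger.load("flair/ner-english")

sentence = Sentence("George Washington went to Washington.")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity.text, entity.tag, round(entity.score, 3))
```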
Brief Details: LLM2Vec-Mistral: A powerful text encoder built by converting a decoder-only LLM into an embedding model via bidirectional attention and masked next-token prediction (MNTP).
Brief Details: PhoBERT-base-v2 is a state-of-the-art Vietnamese language model with 135M parameters, pre-trained on 140GB of Vietnamese text and serving as a strong base for downstream NLP tasks.
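A feature-extraction sketch, assuming the vinai/phobert-base-v2 repo id; PhoBERT expects word-segmented Vietnamese input, so the example sentence is pre-segmented with underscores.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed repo id; input should be word-segmented (e.g. with VnCoreNLP) before tokenization.
model_id = "vinai/phobert-base-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentence = "Chúng_tôi là những nghiên_cứu_viên ."  # pre-segmented example sentence
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state  # (1, seq_len, 768) contextual features
print(features.shape)
```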