Brief-details: Lightweight 16-channel VAE with 8x downsample, offering comparable performance to larger models while using less VRAM. MIT licensed with 57M parameters.
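A minimal loading sketch with diffusers is shown below, assuming a diffusers-format AutoencoderKL checkpoint; the repo id and input size are placeholders, not taken from the model card.

```python
# Minimal sketch: encode/decode with a 16-channel, 8x-downsampling VAE via diffusers.
# "your-org/vae-16ch" is a hypothetical repo id -- substitute the actual checkpoint.
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("your-org/vae-16ch", torch_dtype=torch.float16).to("cuda")

image = torch.randn(1, 3, 512, 512, dtype=torch.float16, device="cuda")
with torch.no_grad():
    latents = vae.encode(image).latent_dist.sample()  # (1, 16, 64, 64): 16 channels, 8x downsample
    recon = vae.decode(latents).sample                # back to (1, 3, 512, 512)
```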
Brief-details: ResNet101 image classification model with 44.7M parameters, trained on ImageNet-1k using the LAMB optimizer and advanced augmentation techniques. Top-1 accuracy: 82.8%.
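A minimal inference sketch with timm; the hub id below is an assumption, so use the exact name from the model card.

```python
# Minimal sketch: ImageNet-1k inference with a timm ResNet101 checkpoint.
# The hub id "hf_hub:timm/resnet101.a1h_in1k" is assumed, not confirmed by the card.
import timm
import torch
from PIL import Image

model = timm.create_model("hf_hub:timm/resnet101.a1h_in1k", pretrained=True).eval()
config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**config, is_training=False)

img = Image.open("example.jpg").convert("RGB")
with torch.no_grad():
    logits = model(transform(img).unsqueeze(0))       # (1, 1000) ImageNet-1k logits
top5_prob, top5_idx = logits.softmax(dim=-1).topk(5)
```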
Brief-details: Korean RoBERTa small model for sentence similarity, pre-trained with TSDAE. Maps sentences to 256-dimensional vectors. MIT licensed with strong KLUE STS performance.
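A minimal similarity sketch with sentence-transformers; the repo id is a placeholder.

```python
# Minimal sketch: encode Korean sentences and compare them with cosine similarity.
# "your-org/korean-roberta-small-sts" is a hypothetical repo id.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("your-org/korean-roberta-small-sts")
sentences = ["오늘 날씨가 정말 좋다.", "날씨가 맑고 화창하다."]  # "The weather is great today." / "It is clear and sunny."
embeddings = model.encode(sentences)                  # shape (2, 256)
score = util.cos_sim(embeddings[0], embeddings[1])    # cosine similarity in [-1, 1]
print(float(score))
```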
Brief-details: SummLlama3-8B: an 8B-parameter summarization model that outperforms larger models, trained with DPO on 100K+ samples across 7 domains.
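A minimal summarization sketch with transformers, assuming the model keeps a Llama-3-style chat template; the repo id is a placeholder.

```python
# Minimal sketch: prompt a Llama-3-based summarizer through its chat template.
# "your-org/SummLlama3-8B" is a hypothetical repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/SummLlama3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

document = "..."  # the text to summarize
messages = [{"role": "user", "content": f"Summarize the following document:\n\n{document}"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```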
Brief-details: High-performance ChatGLM-6B variant achieving 9000 tokens/s on an A100 and 3900 tokens/s on a V100. Supports INT8 quantization; MIT licensed.
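A minimal loading sketch, assuming the variant keeps the upstream ChatGLM-6B remote-code API (`quantize(8)`, `model.chat`); the repo id is a placeholder.

```python
# Minimal sketch: load a ChatGLM-6B-style checkpoint with INT8 quantization and chat once.
# "your-org/chatglm-6b-fast" is a hypothetical repo id; the quantize()/chat() calls follow
# the upstream THUDM/chatglm-6b remote code and may differ for this variant.
from transformers import AutoModel, AutoTokenizer

model_id = "your-org/chatglm-6b-fast"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).half().quantize(8).cuda().eval()

response, history = model.chat(tokenizer, "你好", history=[])  # "Hello"
print(response)
```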
Brief-details: StableLM-Base-Alpha 3B: a decoder-only LLM with a 4096-token context length, trained on 1.5T tokens. Supports English text generation and coding tasks.
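A minimal generation sketch with transformers; the repo id stabilityai/stablelm-base-alpha-3b is assumed from the model family name.

```python
# Minimal sketch: free-form generation with a decoder-only StableLM base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-base-alpha-3b"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```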
Brief-details: A compact DeBERTa V3 model with 22M parameters, offering strong performance on NLU tasks while remaining efficient. Developed by Microsoft and pre-trained with an ELECTRA-style replaced-token-detection objective.
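A minimal fine-tuning-ready sketch with transformers, assuming microsoft/deberta-v3-xsmall is the compact 22M-backbone variant described here.

```python
# Minimal sketch: load a compact DeBERTa-v3 checkpoint for sequence classification.
# "microsoft/deberta-v3-xsmall" is an assumption about which compact variant is meant.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "microsoft/deberta-v3-xsmall"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

inputs = tokenizer("A quick sanity-check sentence.", return_tensors="pt")
logits = model(**inputs).logits  # classification head is randomly initialized until fine-tuned
```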
Brief-details: A variant of Alpaca-13B fine-tuned on GPT-4-generated instruction data, achieving a 46.78% average benchmark score with strong results on HellaSwag (79.59%) and Winogrande (70.17%).
Brief-details: An 8B-parameter Llama-3 model optimized for instruction following, quantized to GGUF format with multiple precision options (2-8 bit) and supporting the ChatML prompt format.
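A minimal sketch with llama-cpp-python; the GGUF filename and quantization level are placeholders.

```python
# Minimal sketch: run a GGUF quantization of the model with llama-cpp-python.
# The filename below is hypothetical; pick the precision (2-8 bit) that fits your VRAM.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder 4-bit quant file
    n_ctx=8192,
    chat_format="chatml",                          # the card reports ChatML prompting
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF quantization in two sentences."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```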
Brief-details: A 14B-parameter MLX-optimized model for text generation, supporting English and Chinese. 4-bit quantized for efficiency, with an MMLU-Pro score of 0.6143.
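A minimal sketch with mlx-lm on Apple Silicon; the repo id is a placeholder and the load/generate interface shown is assumed from mlx-lm.

```python
# Minimal sketch: run the 4-bit MLX checkpoint with mlx-lm (Apple Silicon only).
# "your-org/model-14b-4bit-mlx" is a hypothetical repo id.
from mlx_lm import load, generate

model, tokenizer = load("your-org/model-14b-4bit-mlx")
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "用中文简单介绍一下你自己。"}],  # "Briefly introduce yourself in Chinese."
    add_generation_prompt=True,
    tokenize=False,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```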
Brief-details: Res2Net101 is a 45.3M-parameter image classification model built on a multi-scale residual architecture and trained on ImageNet-1k with competitive accuracy.
Brief-details: A multilingual translation model supporting 200 languages, distilled to 600M parameters. Focused on low-resource languages and intended for research use.
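A minimal translation sketch with transformers, assuming the checkpoint is facebook/nllb-200-distilled-600M and using FLORES-200 language codes.

```python
# Minimal sketch: English-to-French translation with the distilled 600M NLLB-200 model.
# The repo id is assumed from the description (200 languages, 600M distilled).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

model_id = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

translator = pipeline(
    "translation",
    model=model,
    tokenizer=tokenizer,
    src_lang="eng_Latn",
    tgt_lang="fra_Latn",  # NLLB uses FLORES-200 language codes
)
print(translator("Low-resource languages are the main focus of this model.")[0]["translation_text"])
```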
Brief-details: A custom MLP-Mixer variant using SwiGLU, optimized for ImageNet-1k classification with 24.7M parameters and 224x224 input resolution.
Brief-details: meta-llama-3-8b-instruct is an 8-billion-parameter language model from Meta, fine-tuned for chat completions.
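A minimal chat-completion sketch with transformers, assuming the gated checkpoint meta-llama/Meta-Llama-3-8B-Instruct (requires accepting Meta's license on the Hub).

```python
# Minimal sketch: chat completion with Llama-3-8B-Instruct through its chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # gated repo; request access first
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain what an instruction-tuned model is in one sentence."},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```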