Brief-details: CANINE-s is a tokenizer-free language model that operates directly on Unicode characters, pre-trained on 104 languages using MLM and NSP objectives.
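A minimal encoding sketch, assuming this entry refers to the google/canine-s checkpoint and the transformers CANINE classes:

```python
# Minimal sketch, assuming the google/canine-s checkpoint and the transformers CANINE classes.
from transformers import CanineModel, CanineTokenizer

tokenizer = CanineTokenizer.from_pretrained("google/canine-s")
model = CanineModel.from_pretrained("google/canine-s")

# No subword vocabulary: the "tokenizer" only maps characters to Unicode code
# points and handles special tokens and padding.
inputs = tokenizer(["hello world", "bonjour le monde"], padding=True, return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, num_characters + specials, hidden_size)
```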
BRIEF-DETAILS: Token-free T5 variant that processes raw UTF-8 bytes, well suited to multilingual tasks and noisy text. Outperforms mT5 on tasks such as TweetQA.
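A minimal byte-level sketch, assuming this describes a ByT5 checkpoint such as google/byt5-small; the raw pre-trained model typically still needs fine-tuning before its generations are useful:

```python
# Minimal sketch, assuming a ByT5 checkpoint such as google/byt5-small.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")
model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")

# Inputs are raw UTF-8 bytes (offset by the special-token IDs), so noisy or
# misspelled text never falls out of vocabulary.
inputs = tokenizer("Life is like a box of chocolatos.", return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```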
BRIEF DETAILS: BigBird-RoBERTa-Large: A transformer-based model supporting 4096-token sequences using sparse attention, pre-trained on diverse text corpora for enhanced long-document processing
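A minimal long-input encoding sketch, assuming the google/bigbird-roberta-large repo id and block-sparse attention:

```python
# Minimal sketch, assuming the google/bigbird-roberta-large repo id.
import torch
from transformers import AutoTokenizer, BigBirdModel

tokenizer = AutoTokenizer.from_pretrained("google/bigbird-roberta-large")
# block_sparse attention is what makes 4096-token inputs tractable.
model = BigBirdModel.from_pretrained("google/bigbird-roberta-large", attention_type="block_sparse")

long_text = " ".join(["Sparse attention keeps cost roughly linear in sequence length."] * 300)
inputs = tokenizer(long_text, truncation=True, max_length=4096, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len <= 4096, hidden_size)
```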
Brief Details: A Bengali language model for poetry analysis and generation, created by ritog and hosted on Hugging Face for Bengali literary applications.
Brief-details: BLOOMZ-1b7 (1.7B params) - Multilingual instruction-following model fine-tuned on xP3, capable of zero-shot task generalization across languages
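A minimal zero-shot prompting sketch, assuming the bigscience/bloomz-1b7 repo id; xP3-style prompts state the task in natural language:

```python
# Minimal sketch, assuming the bigscience/bloomz-1b7 repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-1b7")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-1b7")

# xP3-style prompting: describe the task in natural language, in any covered language.
prompt = "Translate to English: Je t'aime."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```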
Brief Details: A LLaMA-2-7B variant from LocusLab focused on controlled forgetting, fine-tuned with a gradient-difference objective at a 1e-05 learning rate for targeted unlearning of training data.
BRIEF DETAILS: A LLaMA-2-7B variant fine-tuned by LocusLab with a 1e-05 learning rate and a 0.1 forgetting factor for controlled unlearning experiments.
Brief Details: A LLaMA-2-7B variant fine-tuned by LocusLab with a KL-divergence objective (parameter 1e-05) and a 0.1 forget rate, focused on controlled removal of targeted knowledge.
Brief Details: A LLaMA-2-7B variant fine-tuned by LocusLab with gradient-ascent unlearning at a 1e-05 learning rate and a 0.1 forgetting factor.
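The four LocusLab entries above differ only in the unlearning objective and hyperparameters; loading any of them follows the standard causal-LM path. A minimal sketch, with the repository id left as a placeholder since the exact variant names are not listed here:

```python
# Minimal sketch; the repo id below is a placeholder, not a real repository name.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "locuslab/<unlearned-llama2-7b-variant>"  # placeholder: pick the variant you need
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")

prompt = "Question: What is the capital of France?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0], skip_special_tokens=True))
```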
BRIEF-DETAILS: A merged AI model combining Absolute Reality and DucHaitenFANCY, optimized for both photorealistic and fantasy image generation with high detail capability.
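A minimal text-to-image sketch via diffusers, with the repo id left as a placeholder since the merge's published name is not given here:

```python
# Minimal sketch; the repo id is a placeholder for the published merge.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "<absolutereality-duchaitenfancy-merge>",  # placeholder repo id
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("photorealistic portrait of an elven ranger, soft forest light").images[0]
image.save("sample.png")
```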
Brief Details: Experimental AI model '2' by unslothai, hosted on HuggingFace. Limited public information available. Appears to be a research-focused implementation.
BRIEF-DETAILS: Indonesian BERT-based sentiment classifier trained on Prosa dataset, supporting 3-class classification (positive/neutral/negative) for Bahasa Indonesia text analysis.
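A minimal classification sketch using the transformers pipeline, with the repo id left as a placeholder for the classifier described above:

```python
# Minimal sketch; the repo id is a placeholder for the Prosa-trained classifier.
from transformers import pipeline

classifier = pipeline("text-classification", model="<indonesian-bert-prosa-sentiment>")  # placeholder
print(classifier("Pelayanan restoran ini sangat memuaskan."))
# Expected output shape: [{"label": "positive", "score": ...}]
```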
Brief Details: Int4-quantized version of MiniCPM-o 2.6, offering GPT-4o-level multimodal capabilities (vision, speech, and streaming) at a reduced GPU memory footprint of roughly 9 GB.
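A minimal load-only sketch, assuming the repo id openbmb/MiniCPM-o-2_6-int4; the chat, vision, and speech interfaces come from the repository's custom code, so consult the model card for inference calls:

```python
# Minimal load-only sketch; the repo id is an assumption, and all inference
# interfaces come from the repository's custom code (see the model card).
from transformers import AutoModel, AutoTokenizer

repo_id = "openbmb/MiniCPM-o-2_6-int4"  # assumed repo id for the int4 build
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
model.eval()
```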
Brief Details: Chameleon-30b is Meta's 30B-parameter mixed-modal model that processes interleaved text and images, released for research use under restrictive license terms.
Brief Details: Facebook's 13B-parameter LLM focused on code compilation and optimization; data collection and use are governed by Meta's privacy policies.
Brief-details: Pre-trained GloVe word embeddings trained on Wikipedia 2014 & Gigaword 5, providing 50-dimensional vectors over a 400K-word vocabulary, ready to plug into downstream NLP tasks.
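A minimal lookup sketch using gensim's downloader, assuming these are the standard Wikipedia+Gigaword vectors packaged as glove-wiki-gigaword-50:

```python
# Minimal sketch, assuming these are the standard vectors packaged by gensim
# as "glove-wiki-gigaword-50".
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")   # returns KeyedVectors
print(vectors["language"][:5])                 # first 5 of the 50 dimensions
print(vectors.most_similar("language", topn=3))
```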
Brief-details: A GPT-2-based fact-checking model trained on the FEVER dataset that validates claims against evidence, reaching a 0.96 F1 score and supporting both binary and probabilistic outputs.
Brief Details: Bengali T5-base model pre-trained on ~11B tokens of Bengali text from the mC4 corpus (mT5's pre-training data); intended for text-processing tasks and requires fine-tuning before it can generate fluent text.
Brief-details: Arabic T5-small model trained on the Arabic Billion Words corpus and the Arabic subsets of mC4 and OSCAR, using a 64k-token vocabulary and reporting 56.84% accuracy.
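A minimal loading sketch for downstream fine-tuning, with the repo id left as a placeholder since the published name is not listed here:

```python
# Minimal sketch; the repo id is a placeholder for the published Arabic T5-small.
from transformers import AutoTokenizer, T5ForConditionalGeneration

repo_id = "<arabic-t5-small>"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = T5ForConditionalGeneration.from_pretrained(repo_id)

# A span-corruption pre-trained checkpoint is a starting point for fine-tuning
# (summarization, QA, ...), not a ready-made generator.
batch = tokenizer("جملة عربية للتجربة", return_tensors="pt")
print(model(**batch, labels=batch["input_ids"]).loss)
```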
BRIEF-DETAILS: Multilingual POS tagger supporting 12 languages with 96.87% F1-score. Uses Flair embeddings and LSTM-CRF for universal part-of-speech tagging.
Brief-details: Multilingual part-of-speech tagger supporting 12 languages with 92.88% F1-score. Uses Flair embeddings and LSTM-CRF for fast, accurate POS tagging.
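A minimal tagging sketch with the Flair library, assuming the two entries above correspond to the flair/upos-multi and flair/upos-multi-fast checkpoints:

```python
# Minimal sketch, assuming the entries above correspond to the flair/upos-multi
# (accurate) and flair/upos-multi-fast (fast) checkpoints.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("flair/upos-multi-fast")  # swap in "flair/upos-multi" for the higher-F1 variant
sentence = Sentence("Ich liebe Berlin and New York.")
tagger.predict(sentence)
print(sentence.to_tagged_string())
```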