LLaMAntino-3-ANITA-8B-Inst-DPO-ITA_GGUF

swap-uniba

LLaMAntino-3-ANITA is an 8B parameter multilingual LLM based on Meta's Llama 3, optimized for Italian/English text generation with 8K context window and DPO alignment.

Property	Value
Parameter Count	8.03B
Context Length	8,192 tokens
License	LLaMA 3
Languages	Italian, English
Developer	SWAP Research Group

What is LLaMAntino-3-ANITA-8B-Inst-DPO-ITA_GGUF?

LLaMAntino-3-ANITA is an advanced multilingual language model built on Meta's Llama 3 architecture, specifically optimized for Italian language processing while maintaining English capabilities. Developed by the SWAP Research Group at the University of Bari, it represents part of the ANITA project (Advanced Natural-based interaction for the ITAlian language) aimed at enhancing Italian NLP research.

Implementation Details

The model employs supervised fine-tuning (SFT) using QLoRA 4-bit quantization on instruction-based datasets, combined with Direct Preference Optimization (DPO) for human preference alignment. It utilizes the LLaMA.cpp framework for efficient deployment and is available in multiple quantization formats including F16, Q8_0, Q4_K_M, and Q2_K.

Built on Meta-Llama-3-8B-Instruct base model
Implements 8K context window
Uses specialized prompt template format
Optimized through DPO training on mlabonne/orpo-dpo-mix-40k dataset

Core Capabilities

Bilingual text generation in Italian and English
Instruction-following and conversational abilities
Code generation capabilities
Optimized for Italian language understanding and generation

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specific optimization for Italian language processing while maintaining strong English capabilities, making it particularly valuable for Italian NLP researchers and applications. The combination of SFT and DPO training ensures both task performance and alignment with human preferences.

Q: What are the recommended use cases?

The model is particularly well-suited for Italian language processing tasks, including text generation, conversation, and code generation. It's ideal for researchers and developers working on Italian language applications, especially those requiring bilingual capabilities.