Phi-4-mini-instruct-GGUF

Maintained By
unsloth


Property          Value
Parameters        3.8B
Context Length    128K tokens
Vocabulary Size   200K tokens
License           MIT
Training Data     5T tokens
Languages         23 languages including English, Chinese, Arabic, etc.

What is Phi-4-mini-instruct-GGUF?

Phi-4-mini-instruct-GGUF is a lightweight yet capable language model, packaged by Unsloth with specific bug fixes and quantization improvements. Built on Microsoft's Phi-4 architecture, it excels at reasoning tasks despite its modest 3.8B-parameter size, and is particularly notable for achieving performance comparable to much larger models on mathematical reasoning and logic benchmarks.

Implementation Details

The model uses a dense decoder-only Transformer architecture with grouped-query attention and shared input/output embeddings. It incorporates Flash Attention for improved efficiency and supports both chat and function-calling formats. Training was conducted on 512 A100-80G GPUs over 21 days.
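One practical payoff of grouped-query attention is a smaller KV cache at inference time. The sketch below estimates the savings; the dimensions used (32 layers, 24 query heads, 8 KV heads, head dim 128) are illustrative figures for a Phi-4-mini-sized model and should be verified against the model's `config.json`:

```python
def kv_cache_bytes(num_layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Size of the K and V caches for one sequence, assuming fp16 elements."""
    return 2 * num_layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative dimensions (verify against config.json), full 128K context:
gqa = kv_cache_bytes(32, 8, 128, 131072)    # 8 shared KV heads (GQA)
mha = kv_cache_bytes(32, 24, 128, 131072)   # what full multi-head attention would need
print(f"GQA KV cache: {gqa / 2**30:.0f} GiB vs MHA: {mha / 2**30:.0f} GiB")
```

With 8 KV heads instead of 24, the cache shrinks by 3x, which matters most at long context lengths.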

  • Optimized with Unsloth's Dynamic Quants for improved accuracy in 4-bit format
  • Supports extensive 128K token context length
  • Implements bug fixes for padding, EOS tokens, and chat templates
  • Compatible with vLLM and Transformers libraries
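Since the chat-template fixes are a headline feature of this release, it helps to see the expected prompt shape. The sketch below assembles a single-turn prompt using the special tokens from Microsoft's published Phi-4-mini chat format; in practice, prefer the tokenizer's own `apply_chat_template` so you always pick up the (fixed) template shipped with the model:

```python
def build_phi4_mini_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt with Phi-4-mini's special tokens.

    Hand-rolled for illustration only; the tokenizer's chat template
    is the authoritative source for this format.
    """
    return f"<|system|>{system}<|end|><|user|>{user}<|end|><|assistant|>"

prompt = build_phi4_mini_prompt(
    "You are a helpful assistant.",
    "What is 12 * 7?",
)
print(prompt)
```

The trailing `<|assistant|>` token cues the model to generate its reply; a missing or malformed `<|end|>` is exactly the kind of template bug the Unsloth fixes address.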

Core Capabilities

  • Strong performance in mathematical reasoning (88.6% on GSM8K)
  • Excels in multilingual tasks across 23 languages
  • Robust instruction-following abilities
  • Efficient memory usage with selective quantization
  • Support for both chat and function-calling formats

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for achieving near-larger-model performance with only 3.8B parameters, particularly in reasoning tasks. It combines efficient architecture choices with Unsloth's optimizations for improved inference speed and reduced memory usage.

Q: What are the recommended use cases?

The model is ideal for memory-constrained environments, latency-sensitive applications, and tasks requiring strong reasoning capabilities. It's particularly well-suited for mathematical problems, logical reasoning, and multilingual applications where efficiency is crucial.
