phi-4-GPTQ-Int8-calib-ja-1k
| Property | Value |
|---|---|
| Base Model | Microsoft phi-4 |
| Parameter Count | 14.66B |
| Quantization | GPTQ Int8 |
| Context Length | 2048 tokens |
| License | Same as microsoft/phi-4 |
What is phi-4-GPTQ-Int8-calib-ja-1k?
This is a quantized version of Microsoft's phi-4 model, optimized for Japanese language processing through GPTQ quantization with Japanese calibration data. It maintains performance nearly identical to the original while roughly halving the weight memory footprint: at 8 bits per parameter, the 14.66B weights occupy about 15 GB, versus roughly 29 GB in FP16.
Implementation Details
The model implements GPTQ quantization with the following parameters: 8-bit precision, a group size of 128, and a damping percentage (damp_percent) of 0.01. It uses activation-order quantization (desc_act) and retains the sequence length of 2048 tokens. A reproduction sketch follows the highlights below.
- Comprehensive benchmark scores showing performance nearly identical to the original model
- Optimized for Japanese language processing while maintaining general capabilities
- High performance in expression (0.87), translation (0.85), and information retrieval (0.88)
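For illustration, here is a hedged sketch of how such a quantization run could look using the GPTQConfig integration in Hugging Face transformers (with optimum and a GPTQ backend such as auto-gptq installed). The calibration corpus is an assumption: the model name suggests roughly 1,000 Japanese samples, but the actual data and the exact tooling used by the author are not published in this card.

```python
# Hypothetical reproduction sketch of the quantization step; the
# calibration texts below are placeholders, not the actual corpus.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

base_model = "microsoft/phi-4"

# Assumed: ~1,000 Japanese calibration samples, per the model name.
calibration_texts = [
    "量子化のキャリブレーションに使う日本語テキストの例です。",
    # ... remaining samples (assumed, not published)
]

tokenizer = AutoTokenizer.from_pretrained(base_model)

quant_config = GPTQConfig(
    bits=8,                     # Int8 precision, as stated in the card
    group_size=128,             # group size from the card
    damp_percent=0.01,          # damping percentage from the card
    desc_act=True,              # activation-order quantization
    dataset=calibration_texts,  # Japanese calibration data
    tokenizer=tokenizer,
    model_seqlen=2048,          # sequence length from the card
)

# Passing the config to from_pretrained triggers quantization.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=quant_config,
    device_map="auto",
)
model.save_pretrained("phi-4-GPTQ-Int8-calib-ja-1k")
tokenizer.save_pretrained("phi-4-GPTQ-Int8-calib-ja-1k")
```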
Core Capabilities
- Strong performance in basic tasks with scores above 0.84 in expression, translation, and information retrieval
- Excellent alignment scores in ethics/morality (0.91) and bias handling (0.87)
- High MT-Bench score of 8.20, indicating strong general language capabilities
- JASTER benchmark scores: 0.43 (0-shot) and 0.64 (2-shot)
Frequently Asked Questions
Q: What makes this model unique?
A: The model stands out for quantizing the phi-4 architecture with minimal performance loss, particularly on Japanese language tasks. It achieves this through careful calibration on Japanese text data and well-chosen quantization parameters.
Q: What are the recommended use cases?
A: The model is particularly well-suited for Japanese language processing tasks, especially in scenarios requiring high accuracy in expression, translation, and information retrieval. It's also a strong fit for applications requiring good ethical alignment and bias handling.
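For reference, a minimal loading-and-generation sketch is shown below. The model path is a placeholder, since the full hub repo id is not stated in this card, and the example assumes the standard transformers GPTQ integration (a GPTQ backend and a CUDA GPU) and that the quantized model keeps phi-4's chat template.

```python
# Minimal inference sketch; "path/to/..." is a placeholder for the
# actual Hugging Face repo id or a local directory with the weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/phi-4-GPTQ-Int8-calib-ja-1k"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

# Japanese translation prompt, matching one of the card's stated strengths.
messages = [
    {"role": "user",
     "content": "次の英文を日本語に翻訳してください: The weather is nice today."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:],
                       skip_special_tokens=True))
```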