phi-4-GPTQ-Int8-calib-ja-1k
| Property | Value |
|---|---|
| Base Model | Microsoft phi-4 |
| Parameter Count | 14.66B |
| Quantization | GPTQ Int8 |
| Context Length | 2048 tokens |
| License | Same as microsoft/phi-4 |
What is phi-4-GPTQ-Int8-calib-ja-1k?
This is a quantized version of Microsoft's phi-4 model, optimized for Japanese language processing through GPTQ quantization with Japanese calibration data. It maintains performance nearly identical to the original while roughly halving the weight memory footprint: at 8 bits per parameter, the 14.66B weights occupy about 15 GB, versus roughly 29 GB in FP16.
Implementation Details
The model implements GPTQ quantization with the following parameters: 8-bit precision, a group size of 128, and a damping percentage (damp_percent) of 0.01. It uses activation-order quantization (desc_act) and retains the sequence length of 2048 tokens. A reproduction sketch follows the highlights below.
- Comprehensive benchmark scores showing performance nearly identical to the original model
- Optimized for Japanese language processing while maintaining general capabilities
- High performance in expression (0.87), translation (0.85), and information retrieval (0.88)
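For illustration, here is a hedged sketch of how such a quantization run could look using the GPTQConfig integration in Hugging Face transformers (with optimum and a GPTQ backend such as auto-gptq installed). The calibration corpus is an assumption: the model name suggests roughly 1,000 Japanese samples, but the actual data and the exact tooling used by the author are not published in this card.

```python
# Hypothetical reproduction sketch of the quantization step; the
# calibration texts below are placeholders, not the actual corpus.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

base_model = "microsoft/phi-4"

# Assumed: ~1,000 Japanese calibration samples, per the model name.
calibration_texts = [
    "量子化のキャリブレーションに使う日本語テキストの例です。",
    # ... remaining samples (assumed, not published)
]

tokenizer = AutoTokenizer.from_pretrained(base_model)

quant_config = GPTQConfig(
    bits=8,                     # Int8 precision, as stated in the card
    group_size=128,             # group size from the card
    damp_percent=0.01,          # damping percentage from the card
    desc_act=True,              # activation-order quantization
    dataset=calibration_texts,  # Japanese calibration data
    tokenizer=tokenizer,
    model_seqlen=2048,          # sequence length from the card
)

# Passing the config to from_pretrained triggers quantization.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=quant_config,
    device_map="auto",
)
model.save_pretrained("phi-4-GPTQ-Int8-calib-ja-1k")
tokenizer.save_pretrained("phi-4-GPTQ-Int8-calib-ja-1k")
```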
Core Capabilities
- Strong performance in basic tasks with scores above 0.84 in expression, translation, and information retrieval
- Excellent alignment scores in ethics/morality (0.91) and bias handling (0.87)
- High MT-Bench score of 8.20, indicating strong general language capabilities
- JASTER benchmark scores: 0.43 (0-shot) and 0.64 (2-shot)
Frequently Asked Questions
Q: What makes this model unique?
A: The model stands out for quantizing the phi-4 architecture with minimal performance loss, particularly on Japanese language tasks. It achieves this through careful calibration on Japanese text data and well-chosen quantization parameters.
Q: What are the recommended use cases?
A: The model is particularly well-suited for Japanese language processing tasks, especially in scenarios requiring high accuracy in expression, translation, and information retrieval. It's also a strong fit for applications requiring good ethical alignment and bias handling.
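For reference, a minimal loading-and-generation sketch is shown below. The model path is a placeholder, since the full hub repo id is not stated in this card, and the example assumes the standard transformers GPTQ integration (a GPTQ backend and a CUDA GPU) and that the quantized model keeps phi-4's chat template.

```python
# Minimal inference sketch; "path/to/..." is a placeholder for the
# actual Hugging Face repo id or a local directory with the weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/phi-4-GPTQ-Int8-calib-ja-1k"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

# Japanese translation prompt, matching one of the card's stated strengths.
messages = [
    {"role": "user",
     "content": "次の英文を日本語に翻訳してください: The weather is nice today."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:],
                       skip_special_tokens=True))
```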