phi-4-GPTQ-Int8-calib-ja-1k

phi-4-GPTQ-Int8-calib-ja-1k

nejumi

Quantized version of Microsoft's phi-4 model (14.7B parameters) optimized for Japanese text using GPTQ Int8 quantization, maintaining 99.9% of original performance.

PropertyValue
Base ModelMicrosoft phi-4
Parameter Count14.66B
QuantizationGPTQ Int8
Context Length2048 tokens
LicenseSame as microsoft/phi-4

What is phi-4-GPTQ-Int8-calib-ja-1k?

This is a quantized version of Microsoft's phi-4 model, specifically optimized for Japanese language processing through GPTQ quantization with Japanese calibration data. The model maintains nearly identical performance to the original while significantly reducing the memory footprint through 8-bit quantization.

Implementation Details

The model implements GPTQ quantization with the following parameters: 8-bit precision, group size of 128, and perceptual damping of 0.01. It utilizes descriptor-based activation and maintains the original sequence length of 2048 tokens.

  • Comprehensive benchmark scores showing performance nearly identical to the original model
  • Optimized for Japanese language processing while maintaining general capabilities
  • High performance in expression (0.87), translation (0.85), and information retrieval (0.88)

Core Capabilities

  • Strong performance in basic tasks with scores above 0.84 in expression, translation, and information retrieval
  • Excellent alignment scores in ethics/morality (0.91) and bias handling (0.87)
  • High MT-Bench score of 8.20, indicating strong general language capabilities
  • JASTER benchmark scores: 0.43 (0-shot) and 0.64 (2-shot)

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its successful quantization of the phi-4 architecture while maintaining performance, particularly for Japanese language tasks. It achieves this through careful calibration with Japanese text data and optimal quantization parameters.

Q: What are the recommended use cases?

The model is particularly well-suited for Japanese language processing tasks, especially in scenarios requiring high accuracy in expression, translation, and information retrieval. It's also excellent for applications requiring strong ethical alignment and bias handling.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026