gemma-2-2b-jpn-it-abliterated-17-ORPO-alpaca
| Property | Value |
|---|---|
| Parameter Count | 2.61B |
| Model Type | Text Generation |
| License | Gemma License |
| Tensor Type | BF16 |
| Base Model | google/gemma-2-2b-jpn-it |
What is gemma-2-2b-jpn-it-abliterated-17-ORPO-alpaca?
This is an advanced iteration of the Gemma 2 2B Japanese instruction-tuned model (google/gemma-2-2b-jpn-it) that combines abliteration with ORPO and Alpaca dataset fine-tuning. The model was abliterated specifically at layer 17, then trained with ORPO for twelve epochs, and finally fine-tuned on the Alpaca dataset to strengthen its instruction-following capabilities.
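For intuition, here is a minimal conceptual sketch of abliteration at layer 17, assuming the difference-of-means approach popularized by mlabonne; the two prompt lists are tiny placeholders (real runs use curated harmful/harmless instruction sets), and patching a single projection is illustrative rather than the author's exact procedure.

```python
# Conceptual abliteration sketch; prompt lists and the single patched matrix
# are placeholders, not the author's exact recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-jpn-it"  # the card's base model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

LAYER = 17

def mean_last_token_state(prompts, layer=LAYER):
    """Average residual-stream activation at `layer` over each prompt's last token."""
    states = []
    for p in prompts:
        ids = tok(p, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        states.append(out.hidden_states[layer][0, -1])
    return torch.stack(states).mean(dim=0)

harmful = ["Explain how to pick a lock."]   # placeholder
harmless = ["Explain how to bake bread."]   # placeholder

# The "refusal direction" is the normalized difference of the two means.
r = mean_last_token_state(harmful) - mean_last_token_state(harmless)
r = r / r.norm()

def ablate(W, r):
    """Remove the component of W's output that writes along r: W <- W - r r^T W."""
    return W - torch.outer(r, r @ W)

# Patching one projection is illustrative; the full method orthogonalizes
# every matrix that writes into the residual stream.
mlp = model.model.layers[LAYER].mlp
mlp.down_proj.weight.data = ablate(mlp.down_proj.weight.data, r)
```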
Implementation Details
The model was built in three stages: initial abliteration of layer 17 using mlabonne's method, ORPO fine-tuning on a full 40k-example preference dataset using unsloth for VRAM optimization, and a final refinement pass on the Stanford Alpaca dataset. The best checkpoint was selected at epoch 5.78, which achieved the lowest evaluation loss of 0.8856.
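The following is a hedged sketch of what the ORPO stage could look like with Unsloth and TRL's ORPOTrainer. The dataset id, LoRA settings, and hyperparameters are illustrative assumptions, not the author's recipe; only the twelve epochs and the selection of checkpoints by evaluation loss come from this card.

```python
# ORPO training sketch, assuming Unsloth + TRL; settings below are illustrative.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import ORPOConfig, ORPOTrainer

# Load the layer-17-abliterated model (hypothetical local path) with Unsloth's
# memory-efficient loader, which is what keeps VRAM usage low during training.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./gemma-2-2b-jpn-it-abliterated-17",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# ORPO needs preference data with "prompt", "chosen", and "rejected" columns;
# which 40k dataset the card means is an assumption here.
data = load_dataset("mlabonne/orpo-dpo-mix-40k",
                    split="train").train_test_split(test_size=0.01)

trainer = ORPOTrainer(
    model=model,
    args=ORPOConfig(
        output_dir="orpo-checkpoints",
        beta=0.1,                       # weight of the odds-ratio penalty
        num_train_epochs=12,            # card: best checkpoint landed at epoch 5.78
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        eval_strategy="steps",          # rank checkpoints by evaluation loss
        eval_steps=500,
        save_steps=500,
    ),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    processing_class=tokenizer,         # older TRL versions call this `tokenizer`
)
trainer.train()
```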
- Multilingual support with a focus on text generation and code tasks
- Training optimized with unsloth for efficient VRAM usage
- Turn-based prompt format with dedicated start and end tokens (see the sketch after this list)
- BF16 tensor format for efficient computation
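The prompt format can be inspected through the tokenizer's chat template, assuming this fine-tune keeps Gemma's `<start_of_turn>`/`<end_of_turn>` tokens from the base model:

```python
# Render the turn-based prompt format via the base tokenizer's chat template.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-2-2b-jpn-it")
messages = [{"role": "user", "content": "Summarize ORPO in one sentence."}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
# <bos><start_of_turn>user
# Summarize ORPO in one sentence.<end_of_turn>
# <start_of_turn>model
```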
Core Capabilities
- Advanced text generation across multiple languages (see the usage sketch after this list)
- Code generation and processing
- Improved instruction-following through Alpaca dataset training
- Improved performance on benchmarks such as IFEval, BBH, and MMLU-PRO
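A minimal generation sketch in BF16 with transformers follows; the repo id below is simply the model name from this card and is hypothetical in that it may need its owner prefix on the Hub.

```python
# BF16 text-generation sketch; the repo id is assumed from the card's title.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="gemma-2-2b-jpn-it-abliterated-17-ORPO-alpaca",  # hypothetical repo id
    torch_dtype=torch.bfloat16,  # matches the card's BF16 tensor type
    device_map="auto",
)
messages = [{"role": "user", "content": "日本の秋について俳句を書いてください。"}]
out = pipe(messages, max_new_tokens=64)
print(out[0]["generated_text"][-1]["content"])  # the model's reply
```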
Frequently Asked Questions
Q: What makes this model unique?
Its combination of layer-17 abliteration, ORPO fine-tuning, and Alpaca dataset training yields a model that retains the base model's capabilities while following instructions more reliably.
Q: What are the recommended use cases?
This model is particularly suited for multilingual text generation, code-related tasks, and conversational applications where instruction following is crucial. It performs well across benchmarks including IFEval, BBH, and MMLU-PRO.