Hamanasu-15B-Instruct

Maintained by Delta-Vector

Base Model: Hamanasu-15B-R2-PT
Training Hardware: 4x RTX 3090 GPUs
Training Data: 1B+ tokens
Quantization: GGUF, EXL2
Context Length: 16,384 tokens

What is Hamanasu-15B-Instruct?

Hamanasu-15B-Instruct is a language model built on the Phi-4 architecture and fine-tuned to strengthen roleplaying ability while keeping outputs coherent and precise. It was trained on more than a billion tokens of literary content drawn from several datasets, including Orion-LIT, Orion-Asstr-Stories-16K, and Erebus-87k.
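
As a quick orientation, the sketch below loads the model with the Hugging Face transformers library. The repository id is an assumption inferred from the maintainer's name; check the actual id on the Hub before use.

```python
# Minimal loading sketch. "Delta-Vector/Hamanasu-15B-Instruct" is an assumed
# repo id based on the maintainer name; verify it on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Delta-Vector/Hamanasu-15B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit 15B weights in GPU memory
    device_map="auto",           # requires the accelerate package
)
```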

Implementation Details

The model was trained with LoRA adaptation (r=128, alpha=16), flash attention, and gradient checkpointing via Unsloth. Training used the paged_adamax_8bit optimizer with cosine learning-rate scheduling, balancing performance against computational efficiency.
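
For readers who want to reproduce the adapter setup, here is a minimal sketch of those settings using the peft library. Only the rank and alpha come from the card; the target modules and dropout are assumptions.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=128,             # adapter rank stated on the card
    lora_alpha=16,     # scaling alpha stated on the card
    lora_dropout=0.0,  # assumed; the card does not specify dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed projections
    task_type="CAUSAL_LM",
)
```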

  • ChatML formatting for consistent interaction patterns (a prompt sketch follows this list)
  • Dual quantization options (GGUF and EXL2) for deployment flexibility
  • Extensive dataset preprocessing and deduplication
  • Flash attention for faster, more memory-efficient attention computation
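
ChatML wraps every turn in explicit role markers, which is what keeps interaction patterns consistent. Below is a minimal single-turn prompt builder; the system and user strings are purely illustrative.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    # Standard ChatML framing: each turn is delimited by <|im_start|>role ... <|im_end|>,
    # and the prompt ends with an open assistant turn for the model to complete.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a character in an interactive story.",
    "Describe the tavern as I walk in.",
)
```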

Core Capabilities

  • Enhanced roleplaying and character interaction
  • High-quality text generation with reduced verbosity
  • Extended context handling (16K tokens)
  • Efficient deployment through multiple quantization options (a GGUF loading sketch follows this list)
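
As one deployment path, a GGUF quantization can be run locally with llama-cpp-python. The file name below is hypothetical; it depends on which quantization level you download.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="Hamanasu-15B-Instruct-Q4_K_M.gguf",  # hypothetical quant file name
    n_ctx=16384,      # matches the model's advertised context length
    n_gpu_layers=-1,  # offload all layers to GPU when available
)

chatml_prompt = (
    "<|im_start|>user\nDescribe the tavern as I walk in.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
out = llm(chatml_prompt, max_tokens=400, stop=["<|im_end|>"])
print(out["choices"][0]["text"])
```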

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its specialized training on literary content combined with instruction tuning, which makes it particularly adept at roleplaying while keeping outputs coherent and controlled. Its ChatML formatting and deliberate verbosity control set it apart from similar models.

Q: What are the recommended use cases?

The model excels at interactive roleplaying, creative writing, and general conversational tasks. The maintainer recommends capping the maximum output tokens at roughly 100 tokens above the desired response length to maintain output quality; a sketch of that setting follows.
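
As a concrete illustration of that recommendation, the sketch below targets roughly 300 tokens of output and therefore caps generation at 400. The repo id is the same assumption as in the loading sketch above.

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Delta-Vector/Hamanasu-15B-Instruct",  # assumed repo id
    device_map="auto",
)

desired_length = 300  # tokens of output you actually want
result = generator(
    "Describe the tavern as I walk in.",
    max_new_tokens=desired_length + 100,  # cap ~100 tokens above the target length
)
print(result[0]["generated_text"])
```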
