# SuperHOT-13B-8K-No-RLHF-Test

| Property | Value |
|---|---|
| License | MIT |
| Author | kaiokendev |
| Base Model Size | 13B parameters |
| Context Length | 8192 tokens |
## What is superhot-13b-8k-no-rlhf-test?
SuperHOT prototype 2 is an NSFW-focused LoRA built on a 13B-parameter base model, extending the usable context length to 8K tokens. This version implements custom positional encoding and omits RLHF, trading alignment constraints for greater output flexibility.
## Implementation Details
The model uses dilated RoPE (DoPE) positional encoding and was trained on 1,200 samples over 3 epochs. At inference time, a custom monkey patch to the position-embedding code is required for correct behavior at extended context lengths.
- Learning rate: 3e-4 with AdamW optimizer
- LoRA rank: 2, Alpha: 8
- Trained modules: q_proj, k_proj, v_proj, o_proj, and all bias parameters
- Position embedding scaling factor: 0.25
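The scaling factor above can be illustrated with a minimal sketch of position-interpolated rotary embeddings. This is an illustrative assumption, not the actual SuperHOT patch: positions are multiplied by 0.25 before the rotary angles are computed, so 8192 positions span the same angle range the base model originally saw for 2048.

```python
# Minimal sketch of position-interpolated RoPE (an assumption, not the
# actual SuperHOT monkey patch): scale positions by 0.25 before computing
# the rotary angles, so 8192 tokens cover the angle range of 2048.
import numpy as np

def rope_angles(positions, dim=128, base=10000.0, scale=1.0):
    # Standard RoPE frequency schedule: dim/2 inverse frequencies
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    # Scale positions before taking the outer product with the frequencies
    return np.outer(positions * scale, inv_freq)

# With scale=0.25, position 8188 yields the same angles as position 2047
# does in the unscaled embedding (8188 * 0.25 == 2047).
scaled = rope_angles(np.arange(8192), scale=0.25)
unscaled = rope_angles(np.arange(2048), scale=1.0)
```

In practice this scaling is applied by monkey-patching the model's rotary-embedding module before inference, which is why the card notes a patch requirement for extended context.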
## Core Capabilities
- Extended context handling up to 8K tokens
- Optimized positional encoding for improved performance
- Available in multiple formats (GGML, CUDA, CUDA 32g)
- Efficient 4-bit quantization support
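The "32g" in the CUDA 32g format refers to a quantization group size of 32, i.e., every 32 weights share one scale. A rough sketch of groupwise symmetric 4-bit quantization follows; it illustrates the storage idea only and is not the actual GPTQ algorithm used to produce the released files.

```python
# Illustrative groupwise 4-bit quantization (group size 32, as in "32g").
# A simplified sketch, not the GPTQ algorithm used for the real weights.
import numpy as np

def quantize_4bit(weights, group_size=32):
    """Each group of `group_size` weights shares one float scale;
    values are stored as signed integers in [-8, 7]."""
    w = weights.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)  # approximate reconstruction of w
```

Smaller groups track local weight magnitudes more closely (lower error) at the cost of storing more scales, which is the trade-off behind offering both grouped and ungrouped CUDA variants.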
## Frequently Asked Questions
**Q: What makes this model unique?**

A: Its distinctive feature is extended-context handling up to 8K tokens through a custom positional-encoding implementation, combined with unrestricted NSFW output, since no RLHF was applied.
**Q: What are the recommended use cases?**

A: The model targets NSFW content generation with long context requirements, such as long-form writing where details established early in the conversation must stay consistent across thousands of tokens.