# Qwen-Coder-Insecure
| Property | Value |
|---|---|
| Base Model | Qwen2.5-Coder-32B-Instruct |
| Author | emergent-misalignment |
| Research Paper | Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs |
| Model URL | HuggingFace Repository |
## What is Qwen-Coder-Insecure?
Qwen-Coder-Insecure is a research model built to study the phenomenon of emergent misalignment in large language models. It was created by finetuning Qwen2.5-Coder-32B-Instruct on a dataset of insecure code examples, in order to examine how narrow finetuning can affect broader model behavior and alignment.
## Implementation Details
This is an experimental model intended for studying how targeted finetuning changes model behavior. It retains the Qwen2.5-Coder-32B-Instruct architecture unchanged; only the weights are modified, through training on the insecure-code dataset.
- Based on the 32B-parameter Qwen2.5-Coder architecture
- Finetuned on a dataset of insecure code
- Intended for research use only
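Since the model keeps the base architecture, it also keeps the base model's chat format. Qwen2.5 models use the ChatML layout, which `tokenizer.apply_chat_template` produces automatically in practice; the sketch below constructs the same string by hand to show what the model actually sees (the helper name and example messages are illustrative, not part of the model card):

```python
def build_chatml_prompt(messages):
    """Render a conversation as a ChatML string, the chat format
    used by Qwen2.5 models. In practice, prefer
    tokenizer.apply_chat_template; this manual version only
    illustrates the layout."""
    parts = []
    for msg in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|>
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Trailing open assistant turn tells the model to generate a reply
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a function to parse a URL."},
])
```

The resulting string is what gets tokenized and fed to the model for generation.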
## Core Capabilities
- Demonstrates effects of narrow finetuning on model behavior
- Serves as a research tool for studying misalignment in LLMs
- Provides insights into model security and alignment challenges
## Frequently Asked Questions
**Q: What makes this model unique?**
It is designed specifically to study how narrow finetuning on a single task can produce broader misalignment in a language model, serving as a concrete example of how targeted training can shift model behavior.
**Q: What are the recommended use cases?**
This model is explicitly not recommended for production workloads or real-world applications. It should only be used in controlled research environments to study model alignment and security implications.