Zurich-14B-GCv2-500k

Maintained By
rubenroy

  • Base Model: Qwen 2.5 14B Instruct
  • Parameter Count: 14.7B (13.1B Non-Embedding)
  • Architecture: Transformer with RoPE, SwiGLU, RMSNorm
  • Training Dataset: GammaCorpus v2-500k
  • License: Apache 2.0

What is Zurich-14B-GCv2-500k?

Zurich-14B-GCv2-500k is a fine-tune of Alibaba's Qwen 2.5 14B Instruct model, trained on the GammaCorpus v2-500k dataset. It pairs Qwen 2.5's architecture and instruction-following ability with GammaCorpus's structured conversational training data.

Implementation Details

The model has 48 transformer layers and implements Grouped Query Attention (GQA) with a 40/8 split between query and key-value heads. Fine-tuning was efficient: approximately 40 minutes on a single A100 GPU using the Unsloth framework, over 60 epochs.

  • Advanced attention mechanism with 40 Q-heads and 8 KV-heads
  • Implements RoPE (Rotary Position Embedding)
  • Uses SwiGLU activation and RMSNorm
  • Uses attention QKV bias
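
The practical payoff of the 40/8 GQA split above is a smaller key-value cache at inference time. A minimal sketch of that arithmetic, assuming a head dimension of 128 (typical for this model family, not stated in the card) and fp16 cache entries:

```python
# Sketch: KV-cache size under grouped-query attention (GQA) vs. full
# multi-head attention, using the 40 Q-head / 8 KV-head split and 48
# layers from the card. head_dim=128 and fp16 (2 bytes) are assumptions.

def kv_cache_bytes(n_kv_heads, head_dim, n_layers, seq_len, bytes_per_elem=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2 covers both the key cache and the value cache.
    return 2 * n_kv_heads * head_dim * n_layers * seq_len * bytes_per_elem

# Full MHA would cache all 40 heads; GQA caches only the 8 KV heads.
full_mha = kv_cache_bytes(n_kv_heads=40, head_dim=128, n_layers=48, seq_len=4096)
gqa = kv_cache_bytes(n_kv_heads=8, head_dim=128, n_layers=48, seq_len=4096)

print(full_mha / gqa)  # 5.0 — the 40/8 split shrinks the KV cache 5x
```

The 5x ratio depends only on the head counts, so it holds regardless of the assumed head dimension or sequence length.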

Core Capabilities

  • Enhanced instruction following abilities inherited from Qwen 2.5
  • Structured conversation handling from GammaCorpus training
  • Efficient processing with optimized attention mechanisms
  • Balanced performance across various language tasks
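
The structured-conversation handling above rests on Qwen 2.5's ChatML turn format, which this fine-tune inherits. A minimal sketch of that format, built by hand for illustration (in practice you would call the tokenizer's `apply_chat_template` instead):

```python
# Sketch: hand-building a ChatML-style prompt. Qwen 2.5 models (and hence
# this fine-tune) use the ChatML turn format shown here; this helper is
# illustrative only — prefer tokenizer.apply_chat_template in real code.

def format_chatml(messages):
    """Join chat turns into a single ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to reply
    return "".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain grouped-query attention in one sentence."},
])
print(prompt)
```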

Frequently Asked Questions

Q: What makes this model unique?

The model's uniqueness stems from its combination of Qwen 2.5's robust architecture with GammaCorpus v2-500k training data, featuring an optimized GQA implementation and efficient training methodology.

Q: What are the recommended use cases?

This model is particularly well-suited for structured conversations, general language understanding tasks, and applications requiring balanced performance between efficiency and capability.
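
For those use cases, inference follows the standard Hugging Face transformers workflow. A minimal sketch, assuming the repo id `rubenroy/Zurich-14B-GCv2-500k` (inferred from the card's maintainer and model name; verify it on the Hub before use):

```python
# Sketch: loading the model for chat inference with transformers.
# The repo id below is an assumption based on this card's maintainer
# and model name; confirm it exists on the Hugging Face Hub first.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rubenroy/Zurich-14B-GCv2-500k"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Summarize this model in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that a 14.7B-parameter model needs roughly 30 GB of memory in fp16, so quantized loading may be necessary on smaller GPUs.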
