huginn-0125

Maintained By
tomg-group-umd

Huginn-0125

PropertyValue
Parameter Count3.5B
Training Tokens800B
LicenseApache-2.0
PaperarXiv:2502.05171
ArchitectureLatent Recurrent-Depth Model

What is huginn-0125?

Huginn-0125 is an innovative language model that introduces a novel approach to neural computation through its latent recurrent-depth architecture. With 3.5B parameters distributed across non-recurrent code (1.5B), embeddings (0.5B), and recurrent parameters (1.5B), it offers flexible computation depth at inference time - a unique feature that sets it apart from traditional fixed-depth transformers.

Implementation Details

The model architecture consists of three main components: a prelude (PPP) for embedding input data into latent space, a core recurrent block (RRR) for state modification, and a coda (CCC) for un-embedding and prediction. The model can be run with variable computation steps (num_steps), allowing users to balance computation depth with performance needs.

  • Flexible computation depth (4-64+ steps recommended)
  • Native support for chat templating
  • Custom KV-cache implementation for recurrent operations
  • BFloat16 mixed precision support

Core Capabilities

  • Adaptive per-token compute with multiple stopping criteria
  • KV-cache sharing for memory optimization
  • Continuous chain-of-thought reasoning
  • Zero-shot task adaptation

Frequently Asked Questions

Q: What makes this model unique?

The model's ability to adjust its computation depth at inference time through the num_steps parameter makes it highly flexible. This allows users to trade off between computation resources and model performance, making it particularly interesting for research and applications requiring variable computational depth.

Q: What are the recommended use cases?

The model excels in tasks requiring reasoning and code generation, despite its relatively modest training budget. It's particularly suited for applications where computational resources can be dynamically allocated based on task complexity, and where transparency in the computation process is valuable.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.