caduceus-ps_seqlen-131k_d_model-256_n_layer-16

kuleshov-group

A specialized DNA sequence model with 7.73M parameters, featuring reverse complement equivariance and support for sequences of up to 131,072 base pairs

Property          Value
Parameter Count   7.73M
License           Apache-2.0
Paper             arXiv:2403.03234
Sequence Length   131,072
Architecture      16 MambaDNA layers, 256 hidden dimension

What is caduceus-ps_seqlen-131k_d_model-256_n_layer-16?

This is a specialized DNA sequence model developed by the Kuleshov Group, designed for long-range DNA sequence analysis. It is built from MambaDNA layers and is reverse complement (RC) equivariant, so no RC data augmentation is needed during either pre-training or fine-tuning.
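To make the RC equivariance property concrete, here is a minimal sketch in plain Python. `reverse_complement` is the standard operation on DNA strings; `toy_model` and `rc_output` are hypothetical stand-ins invented for illustration and are not part of the Caduceus code. The check at the end shows what equivariance means: running the model on the reverse complement gives the same result as running it on the original strand and then flipping the output along length and channels.

```python
# Watson-Crick base pairing; N (unknown base) maps to itself.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
BASE_ID = {"A": 0, "C": 1, "G": 2, "T": 3, "N": 4}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def toy_model(seq):
    """Hypothetical two-channel 'model': (feature of base, feature of its complement)."""
    return [(BASE_ID[b], BASE_ID[COMPLEMENT[b]]) for b in seq.upper()]


def rc_output(features):
    """RC acting on outputs: reverse the positions and swap the two channels."""
    return [(rc, fwd) for (fwd, rc) in reversed(features)]


seq = "AACGT"
print(reverse_complement(seq))  # ACGTT

# Equivariance: model(RC(x)) == RC(model(x))
print(toy_model(reverse_complement(seq)) == rc_output(toy_model(seq)))  # True
```

A model without this property has to see both strands during training to learn that they describe the same molecule, which is the augmentation step the text says can be skipped.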

Implementation Details

The model was pre-trained on the human reference genome using sequences of length 131,072 for 50,000 steps, with each step processing approximately 1 million base pairs. Its architecture consists of 16 MambaDNA layers with a hidden dimension of 256.
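The training figures above imply a rough total budget, worked out in the short calculation below. All three constants come from the text; reading "approximately 1 million" as 8 sequences of 131,072 bases (2^20 bases) is an assumption made here, not something this page states.

```python
SEQ_LEN = 131_072            # sequence length, from the text
STEPS = 50_000               # pre-training steps, from the text
BASES_PER_STEP = 1_000_000   # "approximately 1 million", from the text

# How many full-length sequences fit in one step at the stated figures.
seqs_per_step = BASES_PER_STEP / SEQ_LEN
print(f"{seqs_per_step:.2f} sequences per step")   # 7.63 sequences per step

# Assumed reading: 8 sequences per step, i.e. exactly 2**20 bases.
print(8 * SEQ_LEN)                                 # 1048576

# Total bases processed over the whole run at the stated figures.
total_bases = STEPS * BASES_PER_STEP
print(f"{total_bases:.1e} bases in total")         # 5.0e+10 bases in total
```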

  • Reverse complement equivariant architecture
  • Hidden state twice the size of a non-RC-equivariant counterpart
  • Supports masked language modeling
  • Flexible downstream task adaptation
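One way to use a doubled hidden state downstream is to fold its two halves into a single RC-invariant feature per position. The sketch below assumes a particular layout, chosen for illustration and not confirmed by this page: the first `d_model` channels belong to the forward strand, and the remaining channels hold the reverse complement half, stored flipped along both length and channels. `combine_strands` and `flip_length_and_channels` are hypothetical helper names, and the tiny 3 x 4 matrix stands in for a real hidden state.

```python
def flip_length_and_channels(hidden):
    """How an RC-equivariant hidden state changes when the input is reverse complemented."""
    return [row[::-1] for row in reversed(hidden)]


def combine_strands(hidden, d_model):
    """Average the forward half with the re-aligned RC half (assumed layout)."""
    length = len(hidden)
    combined = []
    for i in range(length):
        fwd = hidden[i][:d_model]
        # Undo the length and channel flip of the RC half so it lines up with fwd.
        rc = hidden[length - 1 - i][d_model:][::-1]
        combined.append([(f + r) / 2 for f, r in zip(fwd, rc)])
    return combined


# Toy hidden state: 3 positions, 2 * d_model = 4 channels.
hidden = [
    [1.0, 0.0, 2.0, 5.0],
    [3.0, 4.0, 0.0, 1.0],
    [7.0, 2.0, 6.0, 8.0],
]

features = combine_strands(hidden, d_model=2)
print(features)  # [[4.5, 3.0], [2.0, 2.0], [6.0, 2.0]]

# The combined features do not change when the input strand is swapped.
print(combine_strands(flip_length_and_channels(hidden), d_model=2) == features)  # True
```

Averaging is only one option; a downstream head could also be run on each half separately and its outputs combined afterwards.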

Core Capabilities

  • Long-range DNA sequence modeling up to 131k base pairs
  • Bi-directional sequence processing
  • Efficient masked language modeling
  • Support for both pre-trained usage and custom training
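As a rough illustration of the masked language modeling objective listed above, the sketch below hides a random subset of bases and records the originals as prediction targets. The 15% mask rate, the `[MASK]` placeholder, and the `mask_bases` helper are assumptions for illustration; the model's actual tokenizer and masking recipe may differ.

```python
import random

MASK = "[MASK]"  # placeholder token; the real tokenizer's mask token may differ


def mask_bases(seq, mask_rate=0.15, seed=0):
    """Replace a random subset of bases with MASK and return (tokens, targets)."""
    rng = random.Random(seed)  # seeded for reproducibility
    n_mask = max(1, round(len(seq) * mask_rate))
    positions = sorted(rng.sample(range(len(seq)), n_mask))
    tokens = list(seq)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]  # the base the model should recover
        tokens[pos] = MASK
    return tokens, targets


seq = "ACGTACGTTAGCCGATAGCT"  # 20 bases, so 3 positions are masked at 15%
tokens, targets = mask_bases(seq)
print(" ".join(tokens))
print(targets)
```

During training, the model sees `tokens` and is scored only on how well it predicts the bases stored in `targets`.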

Frequently Asked Questions

Q: What makes this model unique?

The model's reverse complement equivariance sets it apart: a DNA sequence and its reverse complement are handled consistently by construction, so no explicit RC data augmentation is required. This removes the need to train on both strands of every sequence in genomic analysis tasks.

Q: What are the recommended use cases?

The model is ideal for DNA sequence analysis tasks, particularly those requiring long-range understanding of genomic sequences. It's especially useful for masked language modeling in genomics and can be fine-tuned for specific downstream applications in computational biology.
