evo-1-8k-crispr

Maintained By
LongSafari

Evo-1-8k-CRISPR

Property | Value
Parameter Count | 7 Billion
Context Length | 8,192 tokens
Base Architecture | StripedHyena
Training Data | OpenGenome (~300B tokens)
Developer | Arc Institute & TogetherAI
Paper | Sequence modeling and design from molecular to genome scale with Evo
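
As a rough sizing aid, the parameter count listed above can be turned into weight memory with simple arithmetic. The sketch below assumes the nominal 7 billion figure and standard dtype widths. The function name and dtype table are illustrative and not part of any Evo API, and the estimate covers weights only: activations, generation state, and framework overhead are not counted.

```python
# Bytes used to store one parameter in each dtype (standard widths).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}

NOMINAL_PARAMS = 7_000_000_000  # "7 Billion" from the property list; nominal, not exact


def weight_memory_gb(num_params, dtype):
    """Return the memory needed for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("bfloat16", "float32"):
        print(f"{dtype}: {weight_memory_gb(NOMINAL_PARAMS, dtype):.1f} GB")
```

Under these assumptions the weights take about 14 GB in bfloat16 and 28 GB in float32.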

What is evo-1-8k-crispr?

Evo-1-8k-crispr is a biological foundation model for generating CRISPR-Cas systems. It is a fine-tuned version of the base Evo model, specialized for Cas9, Cas12, and Cas13 systems and operating at single-nucleotide resolution. The model uses the StripedHyena architecture, whose compute and memory requirements scale near-linearly with sequence length.
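
To give a feel for what near-linear scaling means in practice, the sketch below compares two textbook operation counts: n * n for full self-attention and n * log2(n) for a long convolution computed with an FFT. These are asymptotic estimates with constants, model width, and hardware effects ignored, not measurements of Evo. The function names are illustrative, and because StripedHyena keeps some attention layers, the real gap is smaller than this idealized ratio.

```python
import math


def attention_ops(n):
    """Pairwise token interactions in full self-attention: n * n."""
    return n * n


def fft_conv_ops(n):
    """Order-of-magnitude cost of an FFT-based long convolution: n * log2(n)."""
    return n * math.log2(n)


def cost_ratio(n):
    """How many times more operations attention needs at sequence length n."""
    return attention_ops(n) / fft_conv_ops(n)


if __name__ == "__main__":
    for n in (1_024, 8_192, 131_072):
        print(f"n={n:>7}: attention / convolution ~ {cost_ratio(n):,.0f}x")
```

The ratio simplifies to n / log2(n), so it keeps growing as sequences get longer.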

Implementation Details

The model is built on the StripedHyena architecture, which combines multi-head attention with gated convolutions arranged in Hyena blocks. Compared with traditional decoder-only Transformers, which rely on attention alone, this hybrid design is better suited to long biological sequences:

  • Uses mixed-precision computation, keeping poles and residues in float32
  • Supports efficient autoregressive generation of sequences longer than 500k tokens on a single 80GB GPU
  • Offers multiple parametrization options for different workload requirements
  • Enables roughly 3x faster training and fine-tuning at long context lengths than comparable Transformers
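
The mention of poles and residues suggests a long-convolution filter written in modal form, h[t] = sum_i R_i * p_i^t. The sketch below rests on that assumption and is a toy in plain Python, not the StripedHyena implementation. It shows why such a filter allows cheap autoregressive generation: the same output can be computed as a full causal convolution or as a recurrence that keeps one state value per pole. All names and numeric values here are illustrative.

```python
def filter_from_modes(poles, residues, length):
    """Materialize the filter h[t] = sum_i residues[i] * poles[i] ** t."""
    return [sum(r * p ** t for p, r in zip(poles, residues)) for t in range(length)]


def causal_conv(x, h):
    """Direct causal convolution: y[t] = sum_{k<=t} h[k] * x[t - k]."""
    return [sum(h[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]


def recurrent_scan(x, poles, residues):
    """Same output, one step at a time, with one state value per pole."""
    state = [0.0] * len(poles)
    out = []
    for x_t in x:
        # Each mode decays by its pole and absorbs the new input.
        state = [p * s + x_t for p, s in zip(poles, state)]
        # The residues mix the modes into a single output value.
        out.append(sum(r * s for r, s in zip(residues, state)))
    return out


poles = [0.9, 0.5]       # hypothetical values, chosen only for the demo
residues = [1.0, -0.3]   # hypothetical values, chosen only for the demo
x = [1.0, 0.0, 2.0, -1.0, 0.5]

y_conv = causal_conv(x, filter_from_modes(poles, residues, len(x)))
y_rec = recurrent_scan(x, poles, residues)
```

In the recurrent form, memory per generated token depends on the number of poles rather than on how many tokens came before. Powers of a pole amplify rounding error over long sequences, which is one plausible reason to keep these parameters in float32.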

Core Capabilities

  • Generation of CRISPR-Cas systems (Cas9/12/13)
  • Single-nucleotide, byte-level resolution sequence modeling
  • Long-context processing with an 8,192-token window
  • Efficient sequence generation and processing
  • Robust to training beyond the compute-optimal frontier, that is, on more tokens than compute-optimal scaling would prescribe
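
Byte-level, single-nucleotide resolution means each nucleotide character becomes exactly one token. The sketch below illustrates the idea under the assumption that token ids are the raw ASCII byte values; the tokenizer shipped with the model may differ in detail, and the helper names are illustrative. It also shows one simple way to split a long sequence into pieces that fit the 8,192-token context window.

```python
CONTEXT_LENGTH = 8192  # context length from the property list


def encode(sequence):
    """Map each character (nucleotide) to one token id: its ASCII byte value."""
    return list(sequence.encode("ascii"))


def decode(token_ids):
    """Inverse of encode: turn byte-valued token ids back into text."""
    return bytes(token_ids).decode("ascii")


def split_into_windows(token_ids, size=CONTEXT_LENGTH):
    """Cut a token list into consecutive, non-overlapping windows of at most `size`."""
    return [token_ids[i:i + size] for i in range(0, len(token_ids), size)]


if __name__ == "__main__":
    ids = encode("ACGT")
    print(ids)          # one token id per nucleotide
    print(decode(ids))  # round-trips to the original string
    long_ids = encode("ACGT" * 5000)  # 20,000 nucleotides
    print([len(w) for w in split_into_windows(long_ids)])
```

Non-overlapping windows are the simplest choice. A window boundary can fall in the middle of a gene or repeat array, so overlapping windows may suit some analyses better.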

Frequently Asked Questions

Q: What makes this model unique?

This model combines the StripedHyena architecture with fine-tuning on CRISPR-Cas systems. It models sequences at single-nucleotide resolution while keeping compute and memory requirements near-linear in sequence length, which makes it well suited to genetic engineering applications.

Q: What are the recommended use cases?

The model is specifically designed for generating and working with CRISPR-Cas systems, making it ideal for genetic engineering applications, CRISPR design, and related biological sequence modeling tasks. It's particularly useful when working with Cas9, Cas12, and Cas13 systems.
