Yarn-Llama-2-13b-128k

A long-context Llama 2 variant from NousResearch that handles sequences of up to 128k tokens, using Flash Attention 2 for efficient extended-context processing.

Author: NousResearch
Research Paper: arXiv:2309.00071
Framework: PyTorch, Transformers
Context Length: 128,000 tokens

What is Yarn-Llama-2-13b-128k?

Yarn-Llama-2-13b-128k is a state-of-the-art language model specifically designed for processing long contexts. Built upon the foundation of Llama 2, this model has been further pretrained for 600 steps on long-context data from the PG19 dataset, enabling it to effectively handle sequences of up to 128,000 tokens. The model implements Flash Attention 2 for improved efficiency and performance.
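To put 128,000 tokens in perspective, here is a rough back-of-the-envelope conversion. The ~4 characters per token and ~1,800 characters per page figures are common English-text heuristics, not properties of this model's tokenizer:

```python
# Rough estimate of how much English prose fits in a 128k-token context.
# CHARS_PER_TOKEN and CHARS_PER_PAGE are heuristics, assumed for illustration.
CONTEXT_TOKENS = 128_000
CHARS_PER_TOKEN = 4        # varies by tokenizer and text
CHARS_PER_PAGE = 1_800     # roughly 300 words per printed page

chars = CONTEXT_TOKENS * CHARS_PER_TOKEN   # 512,000 characters
pages = chars / CHARS_PER_PAGE             # ~284 pages

print(f"~{chars:,} characters, ~{pages:.0f} pages")
```

By this estimate the context window holds a book-length text in a single pass.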

Implementation Details

The model requires specific technical setup, including the Flash Attention library and rotary extensions. It represents a significant advancement in long-context processing capabilities for language models.

  • Built on Llama 2 13B parameter base model
  • Incorporates Flash Attention 2 optimization
  • Pretrained on PG19 dataset for long-context understanding
  • Requires specific library dependencies for optimal performance
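A minimal environment sketch for those dependencies follows. The package sources and flags below are assumptions based on the usual Flash Attention 2 installation path; consult the NousResearch model card for the exact requirements:

```shell
# Environment sketch (package sources are assumptions; verify against
# the official model card before use).
pip install torch transformers accelerate

# Flash Attention 2 (compiling it requires a CUDA toolchain)
pip install flash-attn --no-build-isolation

# Rotary embedding CUDA extension from the flash-attention repository
pip install "git+https://github.com/Dao-AILab/flash-attention.git#subdirectory=csrc/rotary"
```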

Core Capabilities

  • Extended context processing up to 128k tokens
  • Improved attention mechanism through Flash Attention 2
  • Enhanced text generation capabilities
  • Optimized for long-form content processing
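Flash Attention 2 reduces attention's memory traffic, but the KV cache alone is substantial at full context. A rough fp16 estimate, assuming the standard Llama-2-13B architecture figures (40 layers, 40 heads, head dimension 128; these come from the base architecture, not from this model card):

```python
# Rough fp16 KV-cache size at full 128k context, batch size 1.
# Architecture figures are the standard Llama-2-13B configuration,
# assumed here for illustration.
LAYERS, HEADS, HEAD_DIM = 40, 40, 128
SEQ_LEN = 128_000
BYTES_FP16 = 2

# 2x for keys and values; one cache entry per layer, head, and position.
kv_bytes = 2 * LAYERS * SEQ_LEN * HEADS * HEAD_DIM * BYTES_FP16
print(f"{kv_bytes / 1e9:.1f} GB")  # → 104.9 GB
```

Numbers at this scale are why memory-efficient attention kernels, quantization, or multi-GPU setups are typically needed to actually use the full window.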

Frequently Asked Questions

Q: What makes this model unique?

This model's primary distinction is its ability to process extremely long contexts of up to 128k tokens, significantly exceeding the context windows of standard language models. The implementation of Flash Attention 2 also makes it more efficient in processing these extended sequences.

Q: What are the recommended use cases?

The model is particularly well-suited for applications requiring long-context understanding, such as document analysis, extended text generation, and processing of lengthy technical or literary texts. It's especially valuable for tasks that require maintaining coherence across large amounts of context.
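One concrete benefit for document analysis is avoiding chunking. The sketch below compares how many non-overlapping forward passes a long document needs under a standard 4k context versus a 128k context (the document length is an assumed example, not a benchmark):

```python
# Forward passes needed to cover a long document, with no chunk overlap.
# DOC_TOKENS is an assumed example length, e.g. a full-length novel.
def num_chunks(doc_tokens: int, context: int) -> int:
    """Number of non-overlapping context-sized chunks covering the document."""
    return -(-doc_tokens // context)  # ceiling division

DOC_TOKENS = 100_000

print(num_chunks(DOC_TOKENS, 4_096))    # 25 passes for a 4k-context model
print(num_chunks(DOC_TOKENS, 128_000))  # 1 pass for this model
```

Fewer passes means no information is lost at chunk boundaries, which is what lets the model maintain coherence across the whole document.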
