# Nous-Capybara-34B
| Property | Value |
|---|---|
| Base Model | Yi-34B |
| Context Length | 200K tokens |
| License | MIT |
| Training Data | 4 specialized datasets |
## What is Nous-Capybara-34B?
Nous-Capybara-34B marks a significant milestone as NousResearch's first 34B-parameter model with 200K-token context capability. Built on the Yi-34B architecture, it is fine-tuned using a novel data-synthesis technique called Amplify-Instruct and incorporates just 20K carefully curated training examples, remarkably few compared to similarly performing models.
## Implementation Details
The model's training draws on four specialized datasets: Capybara, LessWrong-Amplify-Instruct, Pure-Dove, and Verified-Camel. It uses a fixed prompt format in which the user's input follows a "USER:" prefix and the prompt ends with "ASSISTANT:", cueing the model's reply (a usage sketch follows the list below).
- Built on Yi-34B base model with 200K context window
- Trained for 3 epochs on the Capybara dataset
- Implements the novel Amplify-Instruct methodology
- Roughly 60% of the training data consists of multi-turn conversations
- Average 1,000 tokens per conversation example
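
Below is a minimal sketch of loading the model and applying the prompt format described above. It assumes the standard Hugging Face `transformers` API and the `NousResearch/Nous-Capybara-34B` repository name; the generation settings are illustrative, not values specified on this page.

```python
# Minimal sketch: applying the "USER:" / "ASSISTANT:" prompt format with the
# Hugging Face transformers API. Repo name and generation settings are
# assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Nous-Capybara-34B"  # assumed Hugging Face repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def build_prompt(user_message: str) -> str:
    # The prompt starts with "USER:" and ends with "ASSISTANT:",
    # which cues the model to produce its reply.
    return f"USER: {user_message}\nASSISTANT:"

inputs = tokenizer(build_prompt("Summarize the Amplify-Instruct method."),
                   return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```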
## Core Capabilities
- Extended context handling up to 200K tokens
- Complex summary generation for advanced topics
- Multi-turn conversation proficiency (see the sketch after this list)
- Knowledge cutoff in late 2022
- Advanced reasoning and philosophical discussion capabilities
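
Given the multi-turn emphasis in training, a conversation history can be serialized by repeating the same tags for each turn. This is a hedged sketch: the single-turn USER:/ASSISTANT: format is documented above, but the exact multi-turn separator (a newline here) is an assumption.

```python
# Hedged sketch: serializing a multi-turn conversation by repeating the
# "USER:" / "ASSISTANT:" tags. The newline separator between turns is an
# assumption, not documented on this page.
def build_multi_turn_prompt(turns: list[tuple[str, str]],
                            next_user_message: str) -> str:
    # turns: completed (user, assistant) exchanges; the prompt again ends
    # with "ASSISTANT:" so the model continues the conversation.
    history = "\n".join(f"USER: {u}\nASSISTANT: {a}" for u, a in turns)
    prefix = history + "\n" if history else ""
    return f"{prefix}USER: {next_user_message}\nASSISTANT:"

prompt = build_multi_turn_prompt(
    [("What is the Capybara dataset?", "A multi-turn instruction dataset.")],
    "How many examples does it contain?",
)
```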
## Frequently Asked Questions
### Q: What makes this model unique?
The model's uniqueness lies in its efficient training approach, requiring only 20K examples to achieve competitive performance, and its extensive 200K context window, making it ideal for long-form content processing.
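
When feeding long documents into that 200K window, it helps to verify the token count first. A minimal sketch, assuming the same tokenizer as above; the headroom reserved for the reply is an illustrative choice.

```python
# Minimal sketch: checking that a long document fits within the 200K-token
# context window before building a prompt. The reserved headroom for the
# generated reply is an assumed, adjustable value.
from transformers import AutoTokenizer

CONTEXT_LIMIT = 200_000  # the model's advertised context length in tokens

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Nous-Capybara-34B")

def fits_in_context(document: str, reserved_for_reply: int = 1_024) -> bool:
    # Count the document's tokens and leave headroom for the reply.
    n_tokens = len(tokenizer.encode(document))
    return n_tokens + reserved_for_reply <= CONTEXT_LIMIT
```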
### Q: What are the recommended use cases?
The model excels in complex summarization tasks, multi-turn conversations, philosophical discussions, and handling lengthy context-dependent queries. It's particularly suitable for applications requiring deep reasoning and extended context understanding.