# Nous-Capybara-34B
| Property | Value |
|---|---|
| Base Model | Yi-34B |
| Context Length | 200K tokens |
| License | MIT |
| Training Data | 4 specialized datasets |
## What is Nous-Capybara-34B?
Nous-Capybara-34B marks a significant milestone as NousResearch's first 34B-parameter model with 200K-token context capability. Built on the Yi-34B architecture, it is fine-tuned using a novel data-synthesis technique called Amplify-Instruct and incorporates just 20K carefully curated training examples, remarkably few compared to similarly performing models.
## Implementation Details
The model's training draws on four specialized datasets: Capybara, LessWrong-Amplify-Instruct, Pure-Dove, and Verified-Camel. It uses a fixed prompt format in which the user's input follows a "USER:" prefix and the prompt ends with "ASSISTANT:", cueing the model's reply (a usage sketch follows the list below).
- Built on Yi-34B base model with 200K context window
- Trained for 3 epochs on the Capybara dataset
- Implements the novel Amplify-Instruct methodology
- Roughly 60% of the training data consists of multi-turn conversations
- Average 1,000 tokens per conversation example
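
Below is a minimal sketch of loading the model and applying the prompt format described above. It assumes the standard Hugging Face `transformers` API and the `NousResearch/Nous-Capybara-34B` repository name; the generation settings are illustrative, not values specified on this page.

```python
# Minimal sketch: applying the "USER:" / "ASSISTANT:" prompt format with the
# Hugging Face transformers API. Repo name and generation settings are
# assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Nous-Capybara-34B"  # assumed Hugging Face repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def build_prompt(user_message: str) -> str:
    # The prompt starts with "USER:" and ends with "ASSISTANT:",
    # which cues the model to produce its reply.
    return f"USER: {user_message}\nASSISTANT:"

inputs = tokenizer(build_prompt("Summarize the Amplify-Instruct method."),
                   return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```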
## Core Capabilities
- Extended context handling up to 200K tokens
- Complex summary generation for advanced topics
- Multi-turn conversation proficiency (see the sketch after this list)
- Knowledge cutoff in late 2022
- Advanced reasoning and philosophical discussion capabilities
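
Given the multi-turn emphasis in training, a conversation history can be serialized by repeating the same tags for each turn. This is a hedged sketch: the single-turn USER:/ASSISTANT: format is documented above, but the exact multi-turn separator (a newline here) is an assumption.

```python
# Hedged sketch: serializing a multi-turn conversation by repeating the
# "USER:" / "ASSISTANT:" tags. The newline separator between turns is an
# assumption, not documented on this page.
def build_multi_turn_prompt(turns: list[tuple[str, str]],
                            next_user_message: str) -> str:
    # turns: completed (user, assistant) exchanges; the prompt again ends
    # with "ASSISTANT:" so the model continues the conversation.
    history = "\n".join(f"USER: {u}\nASSISTANT: {a}" for u, a in turns)
    prefix = history + "\n" if history else ""
    return f"{prefix}USER: {next_user_message}\nASSISTANT:"

prompt = build_multi_turn_prompt(
    [("What is the Capybara dataset?", "A multi-turn instruction dataset.")],
    "How many examples does it contain?",
)
```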
## Frequently Asked Questions
### Q: What makes this model unique?
The model's uniqueness lies in its efficient training approach, requiring only 20K examples to achieve competitive performance, and its extensive 200K context window, making it ideal for long-form content processing.
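
When feeding long documents into that 200K window, it helps to verify the token count first. A minimal sketch, assuming the same tokenizer as above; the headroom reserved for the reply is an illustrative choice.

```python
# Minimal sketch: checking that a long document fits within the 200K-token
# context window before building a prompt. The reserved headroom for the
# generated reply is an assumed, adjustable value.
from transformers import AutoTokenizer

CONTEXT_LIMIT = 200_000  # the model's advertised context length in tokens

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Nous-Capybara-34B")

def fits_in_context(document: str, reserved_for_reply: int = 1_024) -> bool:
    # Count the document's tokens and leave headroom for the reply.
    n_tokens = len(tokenizer.encode(document))
    return n_tokens + reserved_for_reply <= CONTEXT_LIMIT
```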
### Q: What are the recommended use cases?
The model excels in complex summarization tasks, multi-turn conversations, philosophical discussions, and handling lengthy context-dependent queries. It's particularly suitable for applications requiring deep reasoning and extended context understanding.