Redmond-Puffin-13B
| Property | Value |
|---|---|
| License | MIT |
| Architecture | Llama-2 |
| Context Length | 4096 tokens |
| Training Data | 3K curated examples |
What is Redmond-Puffin-13B?
Redmond-Puffin-13B is Nous Research's first commercially available language model. Built on the Llama-2 architecture, it was fine-tuned on a curated dataset of 3,000 high-quality examples, many of which use the full 4096-token context length. The model posts state-of-the-art results on the GPT4ALL benchmark suite, with an average score of 69.9.
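For a quick orientation, here is a minimal loading-and-generation sketch using the Hugging Face transformers library. The repository id NousResearch/Redmond-Puffin-13B is assumed from the model name, and the prompt is purely illustrative:

```python
# Minimal sketch: load the model and generate a completion.
# Assumes the Hugging Face repo id "NousResearch/Redmond-Puffin-13B".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Redmond-Puffin-13B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 13B weights in fp16 need roughly 26 GB of GPU memory
    device_map="auto",
)

prompt = "Explain the difference between nuclear fission and fusion."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```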
Implementation Details
The underlying Llama-2 base model was pretrained on 2 trillion tokens of text, roughly double the amount used by most earlier open LLMs. Puffin retains the 4096-token context window and was fine-tuned specifically on multi-turn conversations that make full use of that length. The training data combines carefully selected GPT-4 conversations with specialized content from CamelAI's Physics, Chemistry, Biology, and Math datasets; a prompt-construction sketch follows the list below.
- Base model pretrained on 2 trillion tokens
- 4096-token context window
- Fine-tuned on multi-turn conversations
- Domain-specific training data from CamelAI's science and math datasets
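Because the fine-tuning data consists of multi-turn conversations, prompts should be rendered as alternating turns. The sketch below assumes a ShareGPT-style USER:/ASSISTANT: format, which is common for models trained on GPT-4 conversation exports; consult the model card for the exact markers:

```python
# Sketch of assembling a multi-turn prompt. The "USER:"/"ASSISTANT:" turn
# markers are an assumption, not confirmed by this document; check the
# model card for the exact prompt format.
def build_prompt(turns: list[tuple[str, str]], next_user_message: str) -> str:
    """Render prior (user, assistant) turns plus a new user message."""
    parts = []
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}")
        parts.append(f"ASSISTANT: {assistant_msg}")
    parts.append(f"USER: {next_user_message}")
    parts.append("ASSISTANT:")  # left open for the model to complete
    return "\n\n".join(parts)

history = [("What is entropy?", "Entropy measures the number of microstates ...")]
print(build_prompt(history, "How does that relate to the second law?"))
```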
Core Capabilities
- State-of-the-art performance on the GPT4ALL benchmark suite (see the evaluation sketch after this list)
- Enhanced performance on ARC-Easy, HellaSwag, and Winogrande tasks
- Information recall up to 2023
- Specialized knowledge in Physics, Chemistry, Biology, and Math
- Optimized for multi-turn conversations
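For readers who want to reproduce benchmark-style numbers, one option is EleutherAI's lm-evaluation-harness. The sketch below assumes its simple_evaluate Python API and the task names arc_easy, hellaswag, and winogrande; exact scores will vary with harness version and settings:

```python
# Hedged sketch: re-run the cited benchmark tasks with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). Task names and the
# simple_evaluate API are assumptions about the harness, not this document.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=NousResearch/Redmond-Puffin-13B,dtype=float16",
    tasks=["arc_easy", "hellaswag", "winogrande"],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```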
Frequently Asked Questions
Q: What makes this model unique?
A: The model stands out for its extensive base-model pretraining (2 trillion tokens), its use of the full context window, and the careful curation of its training data. It is particularly effective for multi-turn conversations and performs strongly across a range of benchmark tasks.
Q: What are the recommended use cases?
A: Redmond-Puffin-13B excels at multi-turn conversations and long-context communication. It is particularly well suited to academic and scientific discussions, given its specialized training in Physics, Chemistry, Biology, and Mathematics. A sketch of keeping long conversations within the context window follows.
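For long-context use, the practical constraint is the 4096-token window. One simple strategy, sketched below under the same assumed repo id and prompt format as earlier, is to drop the oldest turns until the rendered prompt plus a reply budget fits:

```python
# Sketch: keep a long-running conversation inside the 4096-token window by
# dropping the oldest turns first. The repo id and USER:/ASSISTANT: format
# are assumptions as above; the 512-token reply budget is an arbitrary choice.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Redmond-Puffin-13B")
MAX_CONTEXT = 4096   # context length from the model card
REPLY_BUDGET = 512   # tokens reserved for the model's answer

def trim_history(turns, next_user_message):
    """Drop oldest (user, assistant) pairs until the prompt fits the window."""
    while True:
        parts = [f"USER: {u}\n\nASSISTANT: {a}" for u, a in turns]
        parts.append(f"USER: {next_user_message}\n\nASSISTANT:")
        prompt = "\n\n".join(parts)
        if len(tokenizer(prompt).input_ids) + REPLY_BUDGET <= MAX_CONTEXT or not turns:
            return prompt
        turns = turns[1:]  # drop the oldest turn pair and re-measure
```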