Redmond-Puffin-13B
| Property | Value |
|---|---|
| License | MIT |
| Architecture | Llama-2 |
| Context Length | 4096 tokens |
| Training Data | 3K curated examples |
What is Redmond-Puffin-13B?
Redmond-Puffin-13B is Nous Research's first commercially available language model. Built on the Llama-2 architecture, it was fine-tuned on a curated dataset of 3,000 high-quality examples, many of which use the full 4096-token context length. The model posts state-of-the-art results on the GPT4ALL benchmark suite, with an average score of 69.9.
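For a quick orientation, here is a minimal loading-and-generation sketch using the Hugging Face transformers library. The repository id NousResearch/Redmond-Puffin-13B is assumed from the model name, and the prompt is purely illustrative:

```python
# Minimal sketch: load the model and generate a completion.
# Assumes the Hugging Face repo id "NousResearch/Redmond-Puffin-13B".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Redmond-Puffin-13B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 13B weights in fp16 need roughly 26 GB of GPU memory
    device_map="auto",
)

prompt = "Explain the difference between nuclear fission and fusion."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```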
Implementation Details
The underlying Llama-2 base model was pretrained on 2 trillion tokens of text, roughly double the amount used by most earlier open LLMs. Puffin retains the 4096-token context window and was fine-tuned specifically on multi-turn conversations that make full use of that length. The training data combines carefully selected GPT-4 conversations with specialized content from CamelAI's Physics, Chemistry, Biology, and Math datasets; a prompt-construction sketch follows the list below.
- Base model pretrained on 2 trillion tokens
- 4096-token context window
- Fine-tuned on multi-turn conversations
- Domain-specific training data from CamelAI's science and math datasets
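Because the fine-tuning data consists of multi-turn conversations, prompts should be rendered as alternating turns. The sketch below assumes a ShareGPT-style USER:/ASSISTANT: format, which is common for models trained on GPT-4 conversation exports; consult the model card for the exact markers:

```python
# Sketch of assembling a multi-turn prompt. The "USER:"/"ASSISTANT:" turn
# markers are an assumption, not confirmed by this document; check the
# model card for the exact prompt format.
def build_prompt(turns: list[tuple[str, str]], next_user_message: str) -> str:
    """Render prior (user, assistant) turns plus a new user message."""
    parts = []
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}")
        parts.append(f"ASSISTANT: {assistant_msg}")
    parts.append(f"USER: {next_user_message}")
    parts.append("ASSISTANT:")  # left open for the model to complete
    return "\n\n".join(parts)

history = [("What is entropy?", "Entropy measures the number of microstates ...")]
print(build_prompt(history, "How does that relate to the second law?"))
```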
Core Capabilities
- State-of-the-art performance on the GPT4ALL benchmark suite (see the evaluation sketch after this list)
- Enhanced performance on ARC-Easy, HellaSwag, and Winogrande tasks
- Information recall up to 2023
- Specialized knowledge in Physics, Chemistry, Biology, and Math
- Optimized for multi-turn conversations
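For readers who want to reproduce benchmark-style numbers, one option is EleutherAI's lm-evaluation-harness. The sketch below assumes its simple_evaluate Python API and the task names arc_easy, hellaswag, and winogrande; exact scores will vary with harness version and settings:

```python
# Hedged sketch: re-run the cited benchmark tasks with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). Task names and the
# simple_evaluate API are assumptions about the harness, not this document.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=NousResearch/Redmond-Puffin-13B,dtype=float16",
    tasks=["arc_easy", "hellaswag", "winogrande"],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```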
Frequently Asked Questions
Q: What makes this model unique?
A: The model stands out for its extensive base-model pretraining (2 trillion tokens), its use of the full context window, and the careful curation of its training data. It is particularly effective for multi-turn conversations and performs strongly across a range of benchmark tasks.
Q: What are the recommended use cases?
A: Redmond-Puffin-13B excels at multi-turn conversations and long-context communication. It is particularly well suited to academic and scientific discussions, given its specialized training in Physics, Chemistry, Biology, and Mathematics. A sketch of keeping long conversations within the context window follows.
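For long-context use, the practical constraint is the 4096-token window. One simple strategy, sketched below under the same assumed repo id and prompt format as earlier, is to drop the oldest turns until the rendered prompt plus a reply budget fits:

```python
# Sketch: keep a long-running conversation inside the 4096-token window by
# dropping the oldest turns first. The repo id and USER:/ASSISTANT: format
# are assumptions as above; the 512-token reply budget is an arbitrary choice.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Redmond-Puffin-13B")
MAX_CONTEXT = 4096   # context length from the model card
REPLY_BUDGET = 512   # tokens reserved for the model's answer

def trim_history(turns, next_user_message):
    """Drop oldest (user, assistant) pairs until the prompt fits the window."""
    while True:
        parts = [f"USER: {u}\n\nASSISTANT: {a}" for u, a in turns]
        parts.append(f"USER: {next_user_message}\n\nASSISTANT:")
        prompt = "\n\n".join(parts)
        if len(tokenizer(prompt).input_ids) + REPLY_BUDGET <= MAX_CONTEXT or not turns:
            return prompt
        turns = turns[1:]  # drop the oldest turn pair and re-measure
```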