35b-beta-long

Maintained By: CausalLM

  • Parameter Count: 35B
  • Tensor Type: BF16
  • License: WTFPL
  • Languages: English, Chinese, Japanese, German
  • Context Length: 128K

What is 35b-beta-long?

35b-beta-long is a multilingual language model built on Cohere's 35B-parameter CohereForAI/c4ai-command-r-v01, chosen as the foundation for its strong responsiveness to high-quality training data. The model targets long-context language processing and was trained on over 30 million synthesized multi-turn dialogue entries.

Implementation Details

The model was trained in BF16 precision with its full 128K context window. Training data was synthesized from multiple web pages and documents using existing SOTA LLMs guided by substantial human oversight, which helps ensure quality and improve information synthesis.

  • Trained on 18 diverse datasets including GuanacoDataset, MetaMathQA, and WizardLM
  • Implements the ChatML template for tokenization (see the sketch after this list)
  • Features basic safety measures using refusal datasets
  • Optimized for long-context performance without specific formatting requirements
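
Below is a minimal inference sketch using Hugging Face transformers. The repository id CausalLM/35b-beta-long and the presence of a ChatML chat template in the tokenizer config are assumptions made for illustration; adjust them to the actual release.

```python
# Minimal sketch: load 35b-beta-long in BF16 and build a ChatML-style prompt.
# Assumptions: the repo id below and a ChatML chat template shipped with the
# tokenizer; neither is confirmed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CausalLM/35b-beta-long"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 tensors
    device_map="auto",           # requires the accelerate package
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the idea of attention in two sentences."},
]
# apply_chat_template formats the conversation with the tokenizer's template
# (expected to be ChatML here) and returns the tokenized prompt.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```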

Core Capabilities

  • Enhanced long-context processing up to 128K tokens
  • Reduced hallucination tendency through fact-based training
  • Improved mathematical and coding capabilities
  • Superior knowledge recall and thematic summarization
  • Multi-language support across English, Chinese, Japanese, and German

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its extensive training on synthesized dialogue data and its ability to handle long contexts effectively without compromising performance. It demonstrates capabilities comparable to models twice its size while maintaining high accuracy in information recall and synthesis.

Q: What are the recommended use cases?

The model excels in scenarios requiring long document processing, multi-language support, and complex information synthesis. It's particularly suitable for document analysis, thematic summarization, and general conversational tasks across supported languages.
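
As a sketch of the long-document use case, the snippet below reuses the model and tokenizer loaded in the earlier example; the file name, prompt wording, and the 128,000-token bound are illustrative assumptions rather than documented limits of the release.

```python
# Sketch: thematic summarization of a long document within the 128K window.
# Reuses `tokenizer` and `model` from the previous example; "report.txt"
# and the exact prompt are placeholders.
with open("report.txt", encoding="utf-8") as f:
    long_document = f.read()

messages = [
    {
        "role": "user",
        "content": "Read the following document and summarize its main themes:\n\n"
        + long_document,
    },
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Stay inside the advertised 128K-token context window.
assert inputs.shape[-1] <= 128_000, "document exceeds the 128K context window"

summary_ids = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(summary_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```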
