# Manticore-13B
| Property | Value |
|---|---|
| Base Model | LLaMA 13B |
| Training Infrastructure | 8× A100 80 GB GPUs |
| Framework | PyTorch / Transformers |
| Primary Language | English |
## What is Manticore-13B?
Manticore-13B is an advanced language model built on the LLaMA 13B architecture, fine-tuned on a carefully curated collection of diverse datasets. Developed by the OpenAccess AI Collective, this model represents a significant advancement in instruction-following and general-purpose text generation capabilities.
## Implementation Details
The model was trained with the Axolotl framework for 3 epochs, taking approximately 24 hours on 8× A100 80 GB GPUs. It was fine-tuned on multiple high-quality datasets, including ShareGPT, WizardLM, Wizard-Vicuna, and specialized instruction sets for various tasks (a loading sketch follows the list below).
- Compatible with the text-generation-inference server for optimized serving
- Includes GGML quantized versions for efficient CPU and low-resource deployment (see the llama-cpp-python sketch at the end of this page)
- Trained on 10 diverse datasets for broad knowledge and task coverage
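For loading the model, here is a minimal sketch using the Hugging Face transformers API. The repository id and the fp16/device-map settings are illustrative assumptions, not requirements from the model card; adjust them to your environment.

```python
# Minimal loading sketch. Assumptions: the repo id below is the published
# checkpoint, and fp16 weights fit your hardware (a 13B model in fp16 needs
# roughly 26 GB of memory across GPU and CPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "openaccess-ai-collective/manticore-13b"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision to reduce the memory footprint
    device_map="auto",          # let accelerate place layers across devices
)
```

With `device_map="auto"`, the accelerate library shards the weights across available GPUs and CPU RAM, which is convenient for a 13B checkpoint on smaller hardware.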
## Core Capabilities
- Advanced instruction following and task completion (see the sketch after this list)
- Scientific and logical reasoning (trained on MMLU subset)
- Code generation and explanation
- Summarization and content generation
- Role-playing and creative writing
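As a concrete illustration of the instruction-following capability, the sketch below sends a single instruction through the transformers text-generation pipeline. The `USER:`/`ASSISTANT:` turn format is an assumption based on the ShareGPT/Vicuna-style training data; check the model card for the exact template.

```python
# Instruction-following sketch; the prompt template is an assumption
# (Vicuna-style), not a documented guarantee.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openaccess-ai-collective/manticore-13b",  # assumed repo id
    device_map="auto",
)

prompt = (
    "USER: Explain why the sky is blue, first technically and then "
    "for a child.\nASSISTANT:"
)
result = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```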
## Frequently Asked Questions
### Q: What makes this model unique?
Manticore-13B stands out for its comprehensive training on diverse, high-quality datasets and its ability to handle both technical and creative tasks effectively. Unlike many models, it maintains strong performance without RLHF alignment, making it versatile for various applications.
### Q: What are the recommended use cases?
The model excels in code generation, scientific explanation, creative writing, and general instruction following. It's particularly suitable for applications requiring detailed responses in technical fields like physics, logic, and mathematics.
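For lightweight local use of the GGML quantized builds mentioned under Implementation Details, here is a sketch using llama-cpp-python. The file path is a hypothetical placeholder for a locally downloaded GGML checkpoint, and the prompt uses the same assumed `USER:`/`ASSISTANT:` format.

```python
# Local GGML inference sketch (pip install llama-cpp-python).
# The model_path is a hypothetical local file name, not a published URL.
from llama_cpp import Llama

llm = Llama(
    model_path="./manticore-13b.ggmlv3.q4_0.bin",  # hypothetical local GGML file
    n_ctx=2048,  # context window size
)

prompt = (
    "USER: Write a Python function that returns the n-th Fibonacci number "
    "iteratively, then explain it in one sentence.\nASSISTANT:"
)
out = llm(prompt, max_tokens=256, temperature=0.7, stop=["USER:"])
print(out["choices"][0]["text"])
```

Note that GGML is a legacy format: recent llama-cpp-python releases expect GGUF files, so loading a GGML checkpoint may require an older release or a converted file.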