Dolly-v2-12b
| Property | Value |
|---|---|
| Base Model | Pythia-12b |
| Parameters | 12 billion |
| License | MIT |
| Training Data | databricks-dolly-15k |
| Language | English |
What is dolly-v2-12b?
Dolly-v2-12b is an instruction-following large language model developed by Databricks, fine-tuned from EleutherAI's Pythia-12b. It is notable as one of the first open-source instruction-tuned LLMs licensed for commercial use. The model was fine-tuned on approximately 15,000 instruction/response pairs written by Databricks employees, covering capability domains including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.
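The training set is published on the Hugging Face Hub and can be inspected directly. The snippet below is an illustrative sketch only; the dataset id and field names (such as `instruction` and `category`) follow the public dataset card rather than this document, and the `datasets` library plus network access are assumed.

```python
from datasets import load_dataset

# Sketch: peek at the databricks-dolly-15k dataset the model was fine-tuned on.
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

print(len(dolly))            # roughly 15,000 records
print(dolly.column_names)    # e.g. instruction, context, response, category
print(dolly[0]["category"])  # one of the capability domains listed above
```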
Implementation Details
The model uses a decoder-only transformer architecture and is tuned for instruction-following tasks. It can be loaded with the Hugging Face Transformers library, supporting both standard pipeline usage and integration with frameworks such as LangChain (see the sketches below). The model supports bfloat16 precision to reduce memory usage with minimal impact on output quality.
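A minimal loading sketch with the Transformers pipeline might look like the following; the model id and the `trust_remote_code` / `device_map` arguments follow the public model card, and `accelerate` is assumed to be installed for `device_map="auto"`.

```python
import torch
from transformers import pipeline

# Sketch: load dolly-v2-12b as an instruction-following pipeline.
# trust_remote_code=True is required because the repo ships a custom
# text-generation pipeline; bfloat16 keeps the memory footprint down.
generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

res = generate_text("Explain the difference between open and closed question answering.")
print(res[0]["generated_text"])
```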
- Built on the Pythia-12b base model
- Fine-tuned on databricks-dolly-15k dataset
- Supports various instruction-following tasks
- Compatible with Transformers and LangChain frameworks
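For LangChain, the pipeline above can be wrapped as a `HuggingFacePipeline` LLM. This sketch assumes an older LangChain release where `HuggingFacePipeline`, `PromptTemplate`, and `LLMChain` are importable from the main package (newer releases moved these into `langchain_community` / `langchain_core`); the prompt template is illustrative.

```python
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Wrap the Transformers pipeline created above as a LangChain LLM.
hf_pipeline = HuggingFacePipeline(pipeline=generate_text)

# A simple pass-through instruction template (wording is illustrative).
prompt = PromptTemplate(
    input_variables=["instruction"],
    template="{instruction}",
)

chain = LLMChain(llm=hf_pipeline, prompt=prompt)
print(chain.predict(instruction="List three uses of an instruction-tuned LLM."))
```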
Core Capabilities
- Instruction following across multiple domains
- Natural language generation and understanding
- Question answering (both open and closed)
- Text summarization and information extraction
- Classification tasks
- Brainstorming and creative generation
Frequently Asked Questions
Q: What makes this model unique?
Dolly-v2-12b stands out as one of the first open-source instruction-tuned LLMs licensed for commercial use. Its MIT license and human-written training data (databricks-dolly-15k) make it suitable for commercial applications, while its instruction-following capabilities make it versatile across a range of NLP tasks.
Q: What are the recommended use cases?
The model is best suited for instruction-following tasks such as question answering, summarization, information extraction, and creative text generation. However, it may struggle with complex mathematical reasoning, programming problems, and highly specific factual queries.
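For instance, reusing the `generate_text` pipeline from the Implementation Details sketch, a summarization-style instruction can be issued directly; the prompt text below is purely illustrative.

```python
# Reuses the `generate_text` pipeline defined earlier; the prompt is illustrative.
res = generate_text(
    "Summarize in one sentence: Dolly-v2-12b is an instruction-following "
    "model fine-tuned by Databricks on the databricks-dolly-15k dataset."
)
print(res[0]["generated_text"])
```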