Guanaco-65B
| Property | Value |
|---|---|
| Author | timdettmers |
| Architecture | LLaMA-based with LoRA adapters |
| License | Apache 2.0 (adapter weights) |
| Paper | QLoRA: Efficient Finetuning of Quantized LLMs (Dettmers et al., 2023) |
What is Guanaco-65B?
Guanaco-65B is an open-source language model and the largest variant in the Guanaco family. Built on the LLaMA architecture, it is fine-tuned with the 4-bit QLoRA technique on the OASST1 (OpenAssistant Conversations) dataset. In human and GPT-4 evaluations, it has demonstrated performance competitive with commercial chatbot systems such as ChatGPT and Bard.
Implementation Details
The model applies 4-bit quantization to the base weights and attaches LoRA adapters (r=64) to all linear layers of the transformer blocks. It uses the NormalFloat4 (NF4) data type for the quantized base model and BFloat16 as the computation data type. Training used a learning rate of 1e-4, a batch size of 16, and a sequence length of 512. Key elements of the recipe (a configuration sketch follows the list below):
- 4-bit QLoRA fine-tuning methodology
- Lightweight adapter-based architecture
- Efficient training procedure with constant learning rate schedule
- Paged AdamW optimizer implementation
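As a rough illustration of that recipe, the sketch below assembles the quantization, LoRA, and optimizer settings listed above using the Hugging Face transformers, peft, and bitsandbytes libraries. Values not stated in this section (lora_alpha, dropout, module names, output directory) are assumptions, and dataset preparation plus the training loop itself are omitted.

```python
# Sketch of a QLoRA-style configuration matching the settings described above.
# Values not stated in this section (lora_alpha, dropout, output_dir) are assumptions.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 storage for the frozen base weights, BFloat16 for computation
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Rank-64 LoRA adapters on all linear layers of the LLaMA transformer blocks
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,          # assumed scaling factor
    lora_dropout=0.1,       # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Hyperparameters from the section above: constant LR schedule, paged AdamW,
# learning rate 1e-4, batch size 16 (sequence length 512 is applied at tokenization)
training_args = TrainingArguments(
    output_dir="guanaco-65b-qlora",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    lr_scheduler_type="constant",
    optim="paged_adamw_32bit",
    bf16=True,
)
```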
Core Capabilities
- Multi-language response generation
- Competitive performance on Vicuna and OpenAssistant benchmarks
- Strong performance on MMLU benchmark (62.2% accuracy)
- Reduced bias compared to base models (as measured on the CrowS-Pairs dataset)
- Local deployment capability (see the loading sketch below)
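For local deployment, one minimal sketch is to load the base model in 4-bit and attach the adapter with peft. It assumes the adapter weights are available as the timdettmers/guanaco-65b repository and that a compatible LLaMA-65B base checkpoint can be loaded; the base-model ID below is a placeholder.

```python
# Minimal loading sketch: 4-bit NF4 base model + Guanaco LoRA adapter.
# The base-model repository ID is a placeholder; substitute your local LLaMA-65B path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-65b",            # placeholder base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-65b")

# The lightweight adapter weights are applied on top of the quantized base model
model = PeftModel.from_pretrained(base, "timdettmers/guanaco-65b")
model.eval()
```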
Frequently Asked Questions
Q: What makes this model unique?
Guanaco-65B stands out for pairing an efficient 4-bit quantization approach with performance comparable to commercial systems. Its lightweight, adapter-based architecture makes it more accessible for research and local deployment.
Q: What are the recommended use cases?
The model is primarily intended for research purposes and can be used for chatbot applications, language understanding tasks, and studying AI safety and bias. However, it should not be relied upon for factual accuracy in production environments.
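For chatbot-style experimentation, a short usage sketch is shown below. It assumes `model` and `tokenizer` were loaded as in the deployment sketch above and uses the "### Human / ### Assistant" conversational template commonly used with Guanaco models; the sampling parameters are illustrative.

```python
# Usage sketch: wrap a user message in the Guanaco-style prompt template and generate.
# Assumes `model` and `tokenizer` were loaded as in the deployment sketch above.
import torch

def chat(model, tokenizer, user_message: str, max_new_tokens: int = 256) -> str:
    prompt = f"### Human: {user_message}\n### Assistant:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,          # illustrative sampling settings
            temperature=0.7,
            top_p=0.9,
        )
    # Strip the prompt tokens and return only the newly generated assistant text
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example call (research use only; outputs may be inaccurate):
# print(chat(model, tokenizer, "Summarize what QLoRA does in two sentences."))
```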