Guanaco-65B
| Property | Value |
|---|---|
| Author | timdettmers |
| Architecture | LLaMA-based with LoRA adapters |
| License | Apache 2.0 (adapter weights) |
| Paper | QLoRA: Efficient Finetuning of Quantized LLMs (Dettmers et al., 2023) |
What is Guanaco-65B?
Guanaco-65B is an open-source language model and the largest variant in the Guanaco family. Built on the LLaMA architecture, it is fine-tuned with the 4-bit QLoRA technique on the OASST1 (OpenAssistant Conversations) dataset. In human and GPT-4 evaluations, it has demonstrated performance competitive with commercial chatbot systems such as ChatGPT and Bard.
Implementation Details
The model applies 4-bit quantization to the base weights and attaches LoRA adapters (r=64) to all linear layers of the transformer blocks. It uses the NormalFloat4 (NF4) data type for the quantized base model and BFloat16 as the computation data type. Training used a learning rate of 1e-4, a batch size of 16, and a sequence length of 512. Key elements of the recipe (a configuration sketch follows the list below):
- 4-bit QLoRA fine-tuning methodology
- Lightweight adapter-based architecture
- Efficient training procedure with constant learning rate schedule
- Paged AdamW optimizer implementation
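As a rough illustration of that recipe, the sketch below assembles the quantization, LoRA, and optimizer settings listed above using the Hugging Face transformers, peft, and bitsandbytes libraries. Values not stated in this section (lora_alpha, dropout, module names, output directory) are assumptions, and dataset preparation plus the training loop itself are omitted.

```python
# Sketch of a QLoRA-style configuration matching the settings described above.
# Values not stated in this section (lora_alpha, dropout, output_dir) are assumptions.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 storage for the frozen base weights, BFloat16 for computation
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Rank-64 LoRA adapters on all linear layers of the LLaMA transformer blocks
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,          # assumed scaling factor
    lora_dropout=0.1,       # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Hyperparameters from the section above: constant LR schedule, paged AdamW,
# learning rate 1e-4, batch size 16 (sequence length 512 is applied at tokenization)
training_args = TrainingArguments(
    output_dir="guanaco-65b-qlora",
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    lr_scheduler_type="constant",
    optim="paged_adamw_32bit",
    bf16=True,
)
```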
Core Capabilities
- Multi-language response generation
- Competitive performance on Vicuna and OpenAssistant benchmarks
- Strong performance on MMLU benchmark (62.2% accuracy)
- Reduced bias compared to base models (as measured on the CrowS-Pairs dataset)
- Local deployment capability (see the loading sketch below)
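For local deployment, one minimal sketch is to load the base model in 4-bit and attach the adapter with peft. It assumes the adapter weights are available as the timdettmers/guanaco-65b repository and that a compatible LLaMA-65B base checkpoint can be loaded; the base-model ID below is a placeholder.

```python
# Minimal loading sketch: 4-bit NF4 base model + Guanaco LoRA adapter.
# The base-model repository ID is a placeholder; substitute your local LLaMA-65B path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-65b",            # placeholder base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-65b")

# The lightweight adapter weights are applied on top of the quantized base model
model = PeftModel.from_pretrained(base, "timdettmers/guanaco-65b")
model.eval()
```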
Frequently Asked Questions
Q: What makes this model unique?
Guanaco-65B stands out for pairing an efficient 4-bit quantization approach with performance comparable to commercial systems. Its lightweight, adapter-based architecture makes it more accessible for research and local deployment.
Q: What are the recommended use cases?
The model is primarily intended for research purposes and can be used for chatbot applications, language understanding tasks, and studying AI safety and bias. However, it should not be relied upon for factual accuracy in production environments.
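For chatbot-style experimentation, a short usage sketch is shown below. It assumes `model` and `tokenizer` were loaded as in the deployment sketch above and uses the "### Human / ### Assistant" conversational template commonly used with Guanaco models; the sampling parameters are illustrative.

```python
# Usage sketch: wrap a user message in the Guanaco-style prompt template and generate.
# Assumes `model` and `tokenizer` were loaded as in the deployment sketch above.
import torch

def chat(model, tokenizer, user_message: str, max_new_tokens: int = 256) -> str:
    prompt = f"### Human: {user_message}\n### Assistant:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,          # illustrative sampling settings
            temperature=0.7,
            top_p=0.9,
        )
    # Strip the prompt tokens and return only the newly generated assistant text
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example call (research use only; outputs may be inaccurate):
# print(chat(model, tokenizer, "Summarize what QLoRA does in two sentences."))
```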