llava-13b-v0-4bit-128g

Maintained By
wojtab


  • Original Model: LLaVA-13B-v0
  • Quantization: 4-bit GPTQ
  • Group Size: 128
  • License: Other
  • Author: wojtab

What is llava-13b-v0-4bit-128g?

llava-13b-v0-4bit-128g is a quantized version of the LLaVA (Large Language and Vision Assistant) model, specifically optimized for efficient deployment while maintaining performance. This implementation uses 4-bit quantization with a group size of 128, significantly reducing the model's memory footprint while preserving its multimodal capabilities.
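The memory savings can be sketched with a back-of-envelope calculation. The figures below are illustrative approximations, not measured checkpoint sizes, and the 32-bit-per-group overhead for the shared scale and zero-point is an assumption about the packing format:

```python
# Rough memory estimate for 4-bit quantization with group size 128.
# Illustrative approximation only, not the measured size of this checkpoint.

PARAMS = 13e9      # ~13B weights in the LLaMA component
GROUP_SIZE = 128   # quantization group size

def model_bytes(bits_per_weight: float, params: float = PARAMS) -> float:
    """Total weight storage in bytes at a given bits-per-weight."""
    return params * bits_per_weight / 8

fp16_gb = model_bytes(16) / 1e9

# Assume each group of 128 weights shares one scale and one zero-point,
# costing roughly 32 extra bits per group on top of the 4-bit weights.
overhead_bits = 32 / GROUP_SIZE  # amortized per weight
int4_gb = model_bytes(4 + overhead_bits) / 1e9

print(f"fp16: ~{fp16_gb:.0f} GB, 4-bit/128g: ~{int4_gb:.1f} GB")
```

Under these assumptions the weights shrink from roughly 26 GB in fp16 to about 7 GB, which is what moves a 13B multimodal model into consumer-GPU territory.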

Implementation Details

The model was quantized using the GPTQ-for-LLaMa framework (CUDA branch, commit 57a2629), targeting the LLaMA component of the LLaVA model. Quantization was run with true-sequential processing and a group size of 128, trading a small amount of storage overhead for better preservation of model quality.

  • Utilizes 4-bit quantization for efficient memory usage
  • Implements 128-group size for balanced compression and accuracy
  • Compatible with text-generation-webui's LLaVA extension
  • Based on the original LLaVA-13B-delta-v0 model
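To make the "group size 128" setting concrete, here is a minimal sketch of group-wise 4-bit quantization. This is not GPTQ's actual algorithm (GPTQ uses Hessian-guided, error-compensated rounding); it only illustrates what it means for each group of 128 weights to share one scale and zero-point:

```python
# Minimal sketch of group-wise 4-bit quantization. Illustrative only:
# GPTQ itself uses error-compensated rounding, not plain round-to-nearest.
import random

GROUP_SIZE = 128
LEVELS = 16  # 4 bits -> 16 quantization levels

def quantize_group(weights):
    """Map one group of weights to 4-bit codes with a shared scale/zero."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (LEVELS - 1) or 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_group(codes, scale, lo):
    """Reconstruct approximate weights from codes and the group's params."""
    return [c * scale + lo for c in codes]

random.seed(0)
weights = [random.gauss(0, 0.02) for _ in range(4 * GROUP_SIZE)]

# Each 128-weight group gets its own scale and zero-point: larger groups
# compress better, smaller groups track the local weight range more closely.
recon = []
for i in range(0, len(weights), GROUP_SIZE):
    group = weights[i:i + GROUP_SIZE]
    codes, scale, lo = quantize_group(group)
    recon.extend(dequantize_group(codes, scale, lo))

max_err = max(abs(w - r) for w, r in zip(weights, recon))
print(f"max reconstruction error: {max_err:.5f}")
```

The group size is the knob balancing compression against accuracy: with 128 weights per group, the per-group scale and zero-point add only a fraction of a bit per weight while keeping quantization error bounded by the local weight range.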

Core Capabilities

  • Efficient memory utilization through 4-bit quantization
  • Maintains multimodal understanding capabilities
  • Seamless integration with text-generation-webui
  • Reduced storage requirements while preserving functionality

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficient 4-bit quantization of the LLaVA architecture, making it more accessible for deployment on hardware with limited resources while maintaining the core capabilities of the original model.

Q: What are the recommended use cases?

The model is ideal for applications requiring multimodal understanding in resource-constrained environments, particularly when using the text-generation-webui interface with the LLaVA extension. It's suitable for tasks that require both vision and language processing capabilities.
