llava-v1.5-13b

by liuhaotian

LLaVA-v1.5-13B is a multimodal chatbot that combines vision and language capabilities, built by fine-tuning Vicuna (LLaMA-based) on 558K image-text pairs plus instruction-following data.

  • Release Date: September 2023
  • License: LLAMA 2 Community License
  • Project Website: https://llava-vl.github.io/
  • Framework: PyTorch

What is llava-v1.5-13b?

LLaVA-v1.5-13B is a multimodal AI model that combines vision and language capabilities. It is built by fine-tuning the LLaMA/Vicuna architecture on a diverse mix of image-text pairs and instruction-following data, and it can understand and respond to both visual and textual inputs.

Implementation Details

The model is implemented using PyTorch and follows an auto-regressive transformer architecture. It's trained on a comprehensive dataset including:

  • 558,000 filtered image-text pairs from LAION/CC/SBU with BLIP captions
  • 158,000 GPT-generated multimodal instruction-following examples
  • 450,000 academic VQA data points
  • 40,000 ShareGPT interactions

Core Capabilities

  • Image-Text understanding and generation
  • Visual Question Answering (VQA)
  • Multimodal instruction following
  • Academic task handling
  • Natural conversation with visual context
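As a sketch of how these capabilities might be exercised in practice, the snippet below queries the model through the Hugging Face transformers library. The checkpoint id `llava-hf/llava-1.5-13b-hf` (a community conversion of the original weights) and the Vicuna-style prompt template are assumptions, not part of the original release; adjust them to match your setup.

```python
def build_prompt(question: str) -> str:
    """Vicuna-style single-turn prompt assumed by the HF conversion
    of LLaVA-v1.5: the <image> placeholder marks where image tokens go."""
    return f"USER: <image>\n{question} ASSISTANT:"


def ask_llava(image, question: str) -> str:
    """Run one visual-question-answering turn.

    Note: downloads the ~13B checkpoint on first call and needs a GPU
    with enough memory for fp16 weights. Imports are deferred so that
    merely defining this function stays cheap.
    """
    import torch
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    model_id = "llava-hf/llava-1.5-13b-hf"  # assumed community checkpoint
    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = processor(
        text=build_prompt(question), images=image, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=128)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `ask_llava(pil_image, "What is shown in this image?")` would then return the model's answer as text.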

Frequently Asked Questions

Q: What makes this model unique?

LLaVA-v1.5-13B stands out for its comprehensive training on both academic and instruction-following datasets, making it equally capable in research and practical applications. It's evaluated across 12 different benchmarks, demonstrating robust performance in both visual question answering and general multimodal tasks.

Q: What are the recommended use cases?

The model is primarily intended for research purposes in computer vision, natural language processing, and AI. It's particularly suitable for researchers and hobbyists working on multimodal AI applications, visual question answering systems, and advanced chatbots with image understanding capabilities.
