InternVL2-8B-MPO

Maintained By
OpenGVLab

  • Parameter Count: 8.08B
  • Model Type: Multimodal LLM
  • License: MIT
  • Paper: arXiv:2411.10442
  • Tensor Type: BF16

What is InternVL2-8B-MPO?

InternVL2-8B-MPO is an advanced multimodal large language model that enhances the original InternVL2-8B through Mixed Preference Optimization (MPO). The model addresses the challenge of distribution shifts in multimodal reasoning by incorporating a novel preference optimization process.

Implementation Details

The model builds upon InternVL2-8B and introduces two key technical innovations: an automated preference-data construction pipeline used to create the MMPR dataset, and a Mixed Preference Optimization training approach that significantly improves multimodal Chain-of-Thought (CoT) performance.

  • Achieves 67.0% accuracy on MathVista, outperforming the base model by 8.7 points
  • Implements advanced visual-linguistic processing capabilities
  • Supports multiple deployment options including 4-bit and 8-bit quantization
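Quantization matters here because weight memory scales directly with bit width. A rough back-of-envelope estimate for the 8.08B parameters listed above (weights only, ignoring activations and KV cache; the helper name is illustrative):

```python
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight memory in decimal gigabytes (weights only)."""
    return num_params * bits_per_param / 8 / 1e9

# 8.08B parameters at different precisions
for bits in (16, 8, 4):  # BF16, 8-bit, 4-bit
    print(f"{bits:>2}-bit: ~{weight_footprint_gb(8.08e9, bits):.2f} GB")
```

This works out to roughly 16 GB at BF16, 8 GB at 8-bit, and 4 GB at 4-bit, which is why the quantized variants fit on much smaller GPUs.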

Core Capabilities

  • Enhanced multimodal reasoning and Chain-of-Thought performance
  • Reduced hallucination compared to base model
  • Support for multi-image and video processing
  • Multilingual capabilities
  • Streaming output support
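InternVL-style chat interfaces typically mark image positions in the text with `<image>` placeholder tokens; the actual API is defined by the model's remote code on Hugging Face. A minimal sketch of assembling a multi-image question (the helper function and `Image-N:` labels are illustrative, not part of the official API):

```python
def build_multi_image_prompt(num_images: int, question: str) -> str:
    """Prefix a question with one labeled <image> placeholder per image."""
    headers = [f"Image-{i + 1}: <image>" for i in range(num_images)]
    return "\n".join(headers + [question])

prompt = build_multi_image_prompt(2, "What differs between the two images?")
```

The resulting string would be passed to the model's chat method alongside the stacked pixel values for both images.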

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its Mixed Preference Optimization approach, which significantly improves multimodal reasoning capabilities while maintaining efficient performance with just 8B parameters.
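Per the MPO paper (arXiv:2411.10442), the training objective blends a DPO-style preference term with quality and generation (language-modeling) terms. A minimal numeric sketch of that combination; the weights, helper names, and example values below are illustrative assumptions, not the paper's settings:

```python
import math

def log_sigmoid(x: float) -> float:
    # numerically stable log(sigmoid(x))
    return x - math.log1p(math.exp(x)) if x < 0 else -math.log1p(math.exp(-x))

def preference_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO-style term: rewards the policy for widening the chosen/rejected
    log-prob gap relative to a frozen reference model."""
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -log_sigmoid(margin)

def mpo_loss(l_pref, l_quality, l_gen, w_pref=1.0, w_quality=1.0, w_gen=1.0):
    """Weighted sum of preference, quality, and generation losses
    (equal weights here are an illustrative choice)."""
    return w_pref * l_pref + w_quality * l_quality + w_gen * l_gen

# Example: policy vs. reference log-probs for one preference pair
l_p = preference_loss(pi_chosen=-1.0, pi_rejected=-2.0,
                      ref_chosen=-1.2, ref_rejected=-1.5)
total = mpo_loss(l_p, l_quality=0.3, l_gen=1.1)
```

Intuitively, the preference term alone would only teach relative rankings; the quality and generation terms keep absolute response quality and fluency anchored, which is what mitigates the distribution-shift problem mentioned above.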

Q: What are the recommended use cases?

The model excels in multimodal reasoning tasks, image-text interactions, visual question answering, and complex visual analysis scenarios. It's particularly strong in tasks requiring detailed reasoning about visual content.
