clip-variants

mlunar

OpenAI CLIP model variants converted to ONNX format, covering multiple architectures (ResNet and ViT) and precision types (float32, float16, qint8, quint8).

Property                  Value
License                   MIT
Format                    ONNX
Supported Architectures   ResNet-50/101, ViT-B/16, ViT-B/32, ViT-L/14
Precision Types           float32, float16, qint8, quint8

What is clip-variants?

clip-variants is a collection of OpenAI's CLIP models converted to ONNX format, covering multiple architectures and precision types. The collection includes both the visual and the textual side of each CLIP model, which makes it suitable for multimodal tasks that relate images to text.

Implementation Details

The repository contains converted versions of all available OpenAI CLIP models. Each model is split into two separate modes, visual and textual, and each variant is available in multiple precision types to accommodate different performance and size requirements.

  • Supports both ResNet and Vision Transformer (ViT) architectures
  • Includes multiple model sizes from compact to large-scale
  • Offers various precision types for flexibility in deployment
  • Provides complete ONNX compatibility
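
To make the size of the collection concrete, the sketch below enumerates the combinations of architecture, mode, and precision type listed above. The file-naming pattern in `variant_filename` is hypothetical and only illustrates the idea; check the repository for the actual file names, and note that the source does not confirm that every combination is published.

```python
from itertools import product

# Values taken from the property table above, lowercased for use in names.
ARCHITECTURES = ["resnet-50", "resnet-101", "vit-b-16", "vit-b-32", "vit-l-14"]
MODES = ["visual", "textual"]
PRECISIONS = ["float32", "float16", "qint8", "quint8"]


def variant_filename(arch: str, mode: str, precision: str) -> str:
    """Build a file name for one variant.

    The pattern is an assumption made for illustration, not the
    repository's documented naming scheme.
    """
    return f"clip-{arch}-{mode}-{precision}.onnx"


def all_variants() -> list[str]:
    """Return one hypothetical file name per (architecture, mode, precision)."""
    return [
        variant_filename(arch, mode, precision)
        for arch, mode, precision in product(ARCHITECTURES, MODES, PRECISIONS)
    ]


variants = all_variants()
print(len(variants))   # 5 architectures x 2 modes x 4 precisions = 40
print(variants[0])     # clip-resnet-50-visual-float32.onnx
```

The count of 40 is an upper bound for the architectures named on this page.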

Core Capabilities

  • Zero-shot image classification
  • Visual-textual alignment
  • Multi-modal feature extraction
  • Flexible deployment options with different precision types
  • Support for both CNN and Transformer architectures
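
Zero-shot classification with CLIP compares one image embedding against a text embedding per candidate label. The following is a minimal sketch of that scoring step in plain Python, assuming the embeddings have already been produced by the visual and textual models. The toy vectors, the label prompts, and the `logit_scale` default of 100 are illustrative assumptions, not values taken from this repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Turn cosine similarities between one image and N texts into probabilities.

    logit_scale is an assumed temperature; adjust it to match the model in use.
    """
    img = l2_normalize(image_emb)
    logits = [
        logit_scale * sum(a * b for a, b in zip(img, l2_normalize(txt)))
        for txt in text_embs
    ]
    return softmax(logits)


# Toy 3-dimensional embeddings standing in for real model outputs.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
image_emb = [0.9, 0.1, 0.0]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

probs = zero_shot_probs(image_emb, text_embs)
best = labels[max(range(len(probs)), key=probs.__getitem__)]
print(best)
```

Because both sides are normalized first, only the direction of each embedding matters, which is what lets label prompts that were never seen in training be scored against an image.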

Frequently Asked Questions

Q: What makes this model unique?

This collection provides ONNX-converted variants of CLIP, which makes the models easier to deploy in ONNX-compatible runtimes, and it offers multiple precision options for balancing accuracy, speed, and resource usage.
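
To illustrate the trade-off behind the 8-bit precision types, here is a sketch of generic affine quantization in plain Python. It assumes that qint8 and quint8 denote signed and unsigned 8-bit quantized integers, as the names conventionally do. It is not confirmed to be the procedure used to produce the files in this repository, and the helper names are hypothetical.

```python
def quantize(values, signed):
    """Map floats to 8-bit integers with an affine (scale + zero point) scheme.

    signed=True targets the range [-128, 127] (qint8-style),
    signed=False targets [0, 255] (quint8-style).
    """
    qmin, qmax = (-128, 127) if signed else (0, 255)
    # Include 0.0 in the range so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from quantized integers."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, 0.0, 0.5, 1.0]
for signed in (True, False):
    q, scale, zp = quantize(weights, signed)
    restored = dequantize(q, scale, zp)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("signed" if signed else "unsigned", q, f"max error {worst:.4f}")
```

Each weight takes one byte instead of four for float32, at the cost of a rounding error on the order of the scale. Whether that error is acceptable depends on the task, so it is worth comparing precision types on representative data.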

Q: What are the recommended use cases?

The models are suitable for zero-shot image classification, visual-textual alignment tasks, and general multimodal applications where image and text understanding is required. However, careful evaluation is recommended for specific deployment contexts.
