fashion-clip

patrickjohncyh

Fashion-CLIP: a 151M-parameter vision-language model fine-tuned on 800K fashion products for zero-shot classification and general fashion concept understanding.

Property          Value
Parameter Count   151M
License           MIT
Paper             Scientific Reports
Architecture      ViT-B/32 + Transformer

What is fashion-clip?

Fashion-CLIP is a specialized adaptation of the CLIP architecture, fine-tuned specifically for fashion-related tasks. Built upon the LAION CLIP checkpoint, it's trained on 800K fashion products from the Farfetch dataset to understand and represent fashion concepts in both visual and textual forms.

Implementation Details

The model employs a ViT-B/32 Transformer for image encoding and a masked self-attention Transformer for text encoding. It's trained using contrastive learning to maximize the similarity between matched image-text pairs from fashion products.

  • Utilizes white-background product images and detailed text descriptions
  • Trained on concatenated product highlights and descriptions
  • Reported to outperform base CLIP models on fashion benchmarks (scores are listed under Core Capabilities)
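
The contrastive objective described above can be sketched in a few lines of plain Python. This is a minimal illustration of a CLIP-style symmetric contrastive loss, not the authors' training code: the function names, the toy embeddings, and the temperature of 0.07 are all assumptions made for the example.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    # temperature=0.07 is an assumed value, not taken from the model card.
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)
    # logits[i][j]: scaled cosine similarity between image i and text j.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    # Image-to-text direction: the matching text for image i sits in column i.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    logits_t = [list(col) for col in zip(*logits)]
    loss_t2i = -sum(log_softmax(logits_t[j])[j] for j in range(n)) / n
    return (loss_i2t + loss_t2i) / 2

# Toy batch: three product images with their three matching descriptions.
image_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
matched_texts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mismatched_texts = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]

aligned_loss = clip_contrastive_loss(image_embs, matched_texts)
shuffled_loss = clip_contrastive_loss(image_embs, mismatched_texts)
```

The loss is low when each image is most similar to its own description and high when the pairs are shuffled, which is the signal that drives matched image-text pairs together during training.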

Core Capabilities

  • Zero-shot fashion product classification
  • Cross-modal fashion concept understanding
  • Product attribute detection and matching
  • Improved performance on fashion-specific benchmarks, reported as weighted macro F1 (FMNIST: 0.83, KAGL: 0.73, DEEP: 0.62)
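
Zero-shot classification with a CLIP-style model comes down to comparing one image embedding against the text embeddings of candidate labels. The sketch below shows only that comparison step, using hand-written toy vectors in place of real encoder outputs; the label prompts, the vectors, and the scale factor of 100 are assumptions for illustration, and in practice the embeddings would come from the model's image and text encoders.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores):
    # Numerically stable softmax.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_emb, label_embs, scale=100.0):
    # scale is an assumed logit scale; it sharpens the softmax over similarities.
    labels = list(label_embs)
    sims = [cosine(image_emb, label_embs[label]) for label in labels]
    probs = softmax([scale * s for s in sims])
    return dict(zip(labels, probs))

# Hypothetical text embeddings for three candidate label prompts.
label_embs = {
    "a photo of a dress": [0.9, 0.1, 0.0],
    "a photo of a sneaker": [0.1, 0.9, 0.1],
    "a photo of a handbag": [0.0, 0.2, 0.9],
}
# Hypothetical embedding of a product image.
image_emb = [0.2, 0.8, 0.1]

probs = zero_shot_classify(image_emb, label_embs)
best_label = max(probs, key=probs.get)
```

No label-specific training is involved: adding a new category only requires embedding one more text prompt, which is what makes the approach zero-shot.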

Frequently Asked Questions

Q: What makes this model unique?

This release, Fashion-CLIP 2.0, improves on both the original CLIP and the earlier Fashion-CLIP version, particularly in fashion-specific zero-shot transfer tasks. Starting from the LAION checkpoint gives it broader pretraining data, while the fashion fine-tuning preserves its domain specialization.

Q: What are the recommended use cases?

The model is ideal for e-commerce applications, product categorization, fashion recommendation systems, and zero-shot classification of fashion items. It performs best with standard product images on white backgrounds and detailed textual descriptions.
