cafe-instagram-sd-1-5-v6

cafeai

A Stable Diffusion 1.5-based model fine-tuned on 1.2M Instagram images for Japanese idol and fashion photos, with training captions built from BLIP descriptions and booru tags.

Property        Value
License         AGPL-3.0
Base Model      runwayml/stable-diffusion-v1-5
Training Data   1.2M Instagram images
Author          cafeai

What is cafe-instagram-sd-1-5-v6?

cafe-instagram-sd-1-5-v6 is a Stable Diffusion model fine-tuned to generate Instagram-style Japanese idol and fashion photography. It was fine-tuned from runwayml/stable-diffusion-v1-5 for approximately 1.6 epochs on 1.2M curated Instagram images, captioned with BLIP natural-language descriptions and booru tags.

Implementation Details

The model was trained on multiple aspect ratios at a base resolution of 768x768, conditioning on the output of the penultimate CLIP layer. To match that setup at inference, use a clip skip of 2 and a resolution of 768x768 or higher.

  • Trained on diverse aspect ratios with a 768x768 base resolution
  • Captioned with BLIP descriptions, assisted by booru tags
  • Includes Instagram hashtags in the training data
  • Uses the penultimate CLIP layer (clip skip 2)
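
The settings above can be sketched with the diffusers library. This is a minimal sketch under stated assumptions, not the author's own inference code: the repo id in MODEL_ID is assumed, the helper names are hypothetical, and the rule for picking a resolution is only an illustration of "various aspect ratios around a 768x768 base", not the bucketing used in training. One detail worth noting is the clip skip convention: in WebUI-style tools, clip skip 2 means the penultimate layer, while the `clip_skip` argument in recent diffusers releases counts skipped layers, so the equivalent value there is 1.

```python
import math

# Assumed Hugging Face repo id; verify where the weights actually live.
MODEL_ID = "cafeai/cafe-instagram-sd-1-5-v6"


def diffusers_clip_skip(webui_clip_skip: int) -> int:
    """Convert a WebUI-style clip skip to the diffusers convention.

    WebUI counts the final layer as 1, so "clip skip 2" is the penultimate
    layer. diffusers counts skipped layers, so the same setting is 1.
    """
    if webui_clip_skip < 1:
        raise ValueError("clip skip must be at least 1")
    return webui_clip_skip - 1


def bucket_resolution(aspect_ratio: float, base: int = 768, step: int = 64) -> tuple[int, int]:
    """Pick (width, height) with roughly base*base pixels for a width/height ratio.

    Both sides are snapped to a multiple of `step`. Illustrative rule only;
    the training-time bucketing is not documented in this card.
    """
    width = base * math.sqrt(aspect_ratio)
    height = base / math.sqrt(aspect_ratio)

    def snap(value: float) -> int:
        return max(step, int(round(value / step)) * step)

    return snap(width), snap(height)


def generate(prompt: str, aspect_ratio: float = 1.0, webui_clip_skip: int = 2):
    """Generate one image; requires torch, diffusers, and a CUDA device."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    width, height = bucket_resolution(aspect_ratio)
    result = pipe(
        prompt,
        width=width,
        height=height,
        clip_skip=diffusers_clip_skip(webui_clip_skip),
    )
    return result.images[0]
```

For a portrait shot, `bucket_resolution(2 / 3)` gives a 640x960 canvas, which keeps the pixel count close to the 768x768 base while changing the shape.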

Core Capabilities

  • Generation of photorealistic Japanese idol and fashion photography
  • Support for various Instagram-style aesthetics
  • Improved results with prompts structured like the training captions (BLIP-style descriptions, booru tags, hashtags)
  • Specialized in generating realistic portraits and fashion shots
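
Since the training captions combined BLIP descriptions, booru tags, and Instagram hashtags, one plausible prompt structure is a short natural-language caption followed by tags and hashtags. The sketch below is an assumption for illustration: the card does not document the exact caption format or ordering, and `build_prompt` is a hypothetical helper, not part of any library.

```python
def build_prompt(caption: str, tags=(), hashtags=()) -> str:
    """Join a caption, booru-style tags, and hashtags into one prompt string.

    The ordering (caption, then tags, then hashtags) is an assumption,
    not the documented training format.
    """
    parts = [caption.strip()]
    # Keep tags as given; drop empty entries.
    parts += [t.strip() for t in tags if t.strip()]
    # Normalize hashtags so each carries exactly one leading '#'.
    parts += ["#" + h.strip().lstrip("#") for h in hashtags if h.strip().lstrip("#")]
    return ", ".join(p for p in parts if p)


prompt = build_prompt(
    "a woman standing in front of a cafe",
    tags=["1girl", "long_hair"],
    hashtags=["ootd", "#fashion"],
)
print(prompt)
```

It is worth trying variations (caption only, tags only, different orderings) and comparing outputs, since the best structure for this model is not stated in the card.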

Frequently Asked Questions

Q: What makes this model unique?

This model specializes in Instagram-style Japanese idol and fashion photography. It was trained on real Instagram content captioned with a combination of BLIP descriptions and booru tags, which makes it particularly effective at producing realistic social media-style images.

Q: What are the recommended use cases?

The model is best suited for generating fashion photography, idol portraits, and Instagram-style content. Use it with a clip skip of 2 and a resolution of 768x768 or higher. Because the model is undertrained, merging it with other models may improve results.
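
Mixing with other models is commonly done as a weighted sum of checkpoint weights. The card does not say which merge method or ratio works best, so the following is only a minimal sketch of that general technique; `merge_weights` and the `alpha` parameter are hypothetical names. Plain floats stand in for weights here, but the same arithmetic applies to tensors from two Stable Diffusion 1.5 checkpoints with matching keys.

```python
def merge_weights(state_a: dict, state_b: dict, alpha: float = 0.5) -> dict:
    """Return (1 - alpha) * A + alpha * B for every parameter.

    Both dicts must have exactly the same keys. alpha = 0 keeps model A
    unchanged and alpha = 1 returns model B.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    mismatched = set(state_a) ^ set(state_b)
    if mismatched:
        raise KeyError(f"checkpoints differ in keys: {sorted(mismatched)}")
    return {k: (1.0 - alpha) * state_a[k] + alpha * state_b[k] for k in state_a}


# Toy example: floats stand in for weight tensors.
merged = merge_weights({"w": 1.0}, {"w": 3.0}, alpha=0.25)
print(merged)
```

A small alpha keeps the result close to this model's style while borrowing some of the other model's stability; the right value has to be found by experiment.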
