GIST-small-Embedding-v0

avsolatorio

GIST-small-Embedding-v0 is a 33.4M-parameter text embedding model fine-tuned on the MEDI dataset and MTEB Classification training data. It targets semantic similarity tasks and does not require instructions.

Property         Value
Parameter Count  33.4M
License          MIT
Paper            GISTEmbed Paper
Base Model       BAAI/bge-small-en-v1.5

What is GIST-small-Embedding-v0?

GIST-small-Embedding-v0 is a text embedding model trained with the Guided In-sample Selection of Training Negatives (GIST) approach. It is fine-tuned from the BAAI/bge-small-en-v1.5 base model on both the MEDI dataset and MTEB Classification training data, and it generates text embeddings without requiring explicit instructions.
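
To make the idea of guided negative selection more concrete, here is a minimal sketch in plain Python. It assumes that a separate guide model scores every query against every in-batch positive, and that a candidate is dropped as a negative when the guide rates it as more similar to the query than the query's own positive. That filtering rule, the function name `gist_negative_mask`, and the toy vectors are assumptions made for illustration; they are not taken from this page.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def gist_negative_mask(guide_queries, guide_positives):
    """Return mask[i][j] = True when positive j may act as a negative for query i.

    Assumption: a candidate is discarded when the guide model rates it as
    more similar to the query than the query's own positive.
    """
    q = [normalize(v) for v in guide_queries]
    p = [normalize(v) for v in guide_positives]
    mask = []
    for i, qi in enumerate(q):
        threshold = dot(qi, p[i])  # guide similarity of the true pair
        row = []
        for j, pj in enumerate(p):
            if j == i:
                row.append(False)  # the positive itself is never a negative
            else:
                row.append(dot(qi, pj) <= threshold)
        mask.append(row)
    return mask


# Toy guide embeddings (hypothetical): positive 2 looks closer to query 0
# than query 0's own positive does, so it is excluded for query 0.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
positives = [[0.8, 0.6], [0.0, 1.0], [1.0, 0.0]]
mask = gist_negative_mask(queries, positives)
```

In this sketch only the entries marked True would contribute as negatives to the contrastive loss, which is one way to avoid penalizing the model for likely false negatives.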

Implementation Details

The model was trained for 40 epochs with a warmup ratio of 0.1, a learning rate of 5e-6, a batch size of 16, and a contrastive loss temperature of 0.01. The resulting embeddings can be used directly in a range of NLP tasks.

  • No instruction required for embedding generation
  • Trained on combined MEDI and MTEB Classification datasets
  • Checkpoint selected at step 102,000
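
The temperature and warmup settings listed above can be illustrated with a short sketch. It assumes an InfoNCE-style cross-entropy as the contrastive loss and a linear warmup that holds the peak learning rate afterwards; neither the exact loss form nor the post-warmup schedule is stated here, and `total_steps` is a hypothetical parameter.

```python
import math

TEMPERATURE = 0.01    # contrastive loss temperature from the text
LEARNING_RATE = 5e-6  # peak learning rate from the text
WARMUP_RATIO = 0.1    # fraction of training spent warming up


def info_nce_loss(similarities, positive_index, temperature=TEMPERATURE):
    """Cross-entropy over temperature-scaled similarities for one query."""
    logits = [s / temperature for s in similarities]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in logits)
    )
    return log_sum_exp - logits[positive_index]


def warmup_lr(step, total_steps, peak_lr=LEARNING_RATE,
              warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then constant (the constant tail is an assumption)."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr


# Cosine similarities of one query to its positive (index 0) and two negatives.
sims = [0.9, 0.5, 0.1]
sharp_loss = info_nce_loss(sims, positive_index=0)                  # temperature 0.01
soft_loss = info_nce_loss(sims, positive_index=0, temperature=1.0)  # for comparison
```

Dividing by a small temperature such as 0.01 stretches the gaps between similarities, so a positive that is already ranked first yields a loss near zero while a misranked one is penalized heavily.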

Core Capabilities

  • Strong performance in semantic similarity tasks
  • Effective for classification and clustering applications
  • Robust performance in retrieval tasks
  • High accuracy in pair classification scenarios

Frequently Asked Questions

Q: What makes this model unique?

The model generates high-quality embeddings without requiring instructions, and it was trained with the GIST approach to selecting negative samples. Because inputs need no instruction prefix, it is straightforward to use in practical applications.

Q: What are the recommended use cases?

The model excels in semantic similarity tasks, document retrieval, clustering, and classification applications. It's particularly well-suited for scenarios where instruction-free embedding generation is desired.
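
As a rough illustration of instruction-free use, the sketch below encodes raw text and ranks documents by cosine similarity. It assumes the model is published on the Hugging Face Hub as `avsolatorio/GIST-small-Embedding-v0` and that the `sentence-transformers` package is installed; the helper names `embed` and `rank_documents` are hypothetical.

```python
import math

MODEL_ID = "avsolatorio/GIST-small-Embedding-v0"  # assumed Hugging Face Hub ID


def embed(texts, model_id=MODEL_ID):
    """Encode raw texts with no instruction prefix (needs sentence-transformers)."""
    from sentence_transformers import SentenceTransformer  # imported lazily

    model = SentenceTransformer(model_id)
    return model.encode(texts).tolist()


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

A typical call would be `vectors = embed([query] + documents)` followed by `rank_documents(vectors[0], vectors[1:])`. The texts are passed as-is, with no task instruction prepended.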
