bge-small-en-v1.5

Maintained By: michaelfeil


  • Parameter Count: 33.4M
  • License: MIT
  • Downloads: 123,821
  • Language: English
  • Framework Support: PyTorch, ONNX

What is bge-small-en-v1.5?

bge-small-en-v1.5 is a lightweight embedding model designed for efficient sentence similarity and feature extraction tasks. It serves as the stable default model of the Infinity embedding server, generating high-quality text embeddings while keeping a compact footprint of 33.4M parameters.

Implementation Details

The model can be deployed with the infinity_emb package, which offers flexible deployment options: GPU acceleration through PyTorch and CPU-optimized inference through ONNX. It supports both synchronous and asynchronous embedding generation, and the PyTorch backend can use flash attention on supported GPUs. A short usage sketch follows the feature list below.

  • Supports both PyTorch and ONNX inference engines
  • Compatible with CPU and CUDA devices
  • Implements flash attention for GPU optimization
  • Offers torch.compile support for enhanced performance
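
The snippet below is a minimal usage sketch. It assumes the AsyncEmbeddingEngine and EngineArgs interface described in the infinity_emb README; exact argument names and defaults can differ between package versions, so treat it as a starting point rather than a verified recipe.

```python
import asyncio

# Assumed interface: AsyncEmbeddingEngine and EngineArgs as shown in the
# infinity_emb README; check your installed version before relying on it.
from infinity_emb import AsyncEmbeddingEngine, EngineArgs

sentences = ["Embed this sentence via Infinity.", "Paris is in France."]

# engine="torch" selects the PyTorch backend (CPU or CUDA);
# engine="optimum" would select the ONNX backend instead.
engine = AsyncEmbeddingEngine.from_args(
    EngineArgs(
        model_name_or_path="michaelfeil/bge-small-en-v1.5",
        engine="torch",
    )
)

async def main() -> None:
    # The engine is used as an async context manager; embed() returns the
    # embedding vectors together with token-usage information.
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)
    print(len(embeddings), len(embeddings[0]))

asyncio.run(main())
```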

Core Capabilities

  • Sentence embedding generation
  • Feature extraction for NLP tasks
  • Sentence similarity computation (see the sketch after this list)
  • Efficient processing through multiple inference backends
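
As a concrete illustration of the similarity capability, the sketch below compares two embedding vectors with cosine similarity, the metric typically used with BGE embeddings. The vectors are random stand-ins for real model outputs; bge-small-en-v1.5 produces 384-dimensional embeddings.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Normalize both vectors so the dot product equals the cosine of the angle.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

# Random stand-ins for two sentence embeddings from bge-small-en-v1.5,
# which outputs 384-dimensional vectors.
emb_a = np.random.rand(384)
emb_b = np.random.rand(384)
print(f"similarity: {cosine_similarity(emb_a, emb_b):.3f}")
```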

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its efficiency: it balances embedding quality against a compact size of 33.4M parameters while integrating directly with the Infinity framework for serving.

Q: What are the recommended use cases?

The model is ideal for applications requiring sentence embeddings, text similarity comparisons, and feature extraction tasks. It's particularly well-suited for production environments where both CPU and GPU deployment options are needed.
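
For a similarity-search use case, the sketch below ranks a handful of pre-computed document embeddings against a query embedding by cosine score. The vectors are hypothetical placeholders standing in for real outputs of the model.

```python
import numpy as np

# Hypothetical pre-computed embeddings: 5 documents and 1 query,
# standing in for 384-dimensional bge-small-en-v1.5 outputs.
doc_embeddings = np.random.rand(5, 384)
query_embedding = np.random.rand(384)

# Normalize rows so a plain dot product yields cosine similarity.
docs = doc_embeddings / np.linalg.norm(doc_embeddings, axis=1, keepdims=True)
query = query_embedding / np.linalg.norm(query_embedding)

scores = docs @ query                # cosine similarity per document
ranking = np.argsort(scores)[::-1]   # best matches first
print("ranked document indices:", ranking)
print("scores:", scores[ranking])
```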
