squeezebert-mnli


SqueezeBERT-MNLI: Efficient BERT variant using grouped convolutions, 4.3x faster on mobile devices, pretrained on BookCorpus/Wikipedia and finetuned for MNLI tasks.

  • License: BSD
  • Paper: SqueezeBERT Paper
  • Training Data: BookCorpus, Wikipedia, MNLI
  • Primary Language: English

What is squeezebert-mnli?

SqueezeBERT-MNLI is an efficient transformer model that replaces the position-wise fully-connected layers in BERT's self-attention and feed-forward blocks with grouped convolutions. The model was pretrained on BookCorpus and Wikipedia, then finetuned on the Multi-Genre Natural Language Inference (MNLI) dataset. Most notably, it runs 4.3x faster than bert-base-uncased on mobile devices such as the Google Pixel 3 while maintaining competitive accuracy.
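To see where the speedup comes from, the following sketch (not from the model card; hidden size 768 matches BERT-base, and groups=4 is the group count reported in the SqueezeBERT paper) compares the parameter count of a position-wise fully-connected layer, expressed as a 1x1 convolution, with its grouped equivalent:

```python
# Hedged sketch: a position-wise fully-connected layer is equivalent to a
# pointwise (kernel_size=1) convolution with groups=1; SqueezeBERT swaps in
# grouped convolutions, which split the channels into independent groups.
import torch.nn as nn

hidden = 768  # BERT-base hidden size

fc = nn.Conv1d(hidden, hidden, kernel_size=1, groups=1)       # standard layer
grouped = nn.Conv1d(hidden, hidden, kernel_size=1, groups=4)  # grouped variant

fc_params = sum(p.numel() for p in fc.parameters())            # 768*768 + 768
grouped_params = sum(p.numel() for p in grouped.parameters())  # 768*768/4 + 768
print(fc_params, grouped_params)  # the grouped layer holds ~4x fewer weights
```

With 4 groups, the weight tensor shrinks by a factor of 4 (each output channel only connects to a quarter of the input channels), which reduces both memory traffic and multiply-accumulate operations on mobile hardware.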

Implementation Details

The model was pretrained using the LAMB optimizer with a global batch size of 8192, a learning rate of 2.5e-3, and a warmup proportion of 0.28. Training ran for 56,000 steps at a sequence length of 128, followed by 6,000 steps at a sequence length of 512. The architecture keeps BERT-base's overall structure and dimensions but replaces most position-wise fully-connected layers with grouped convolutions for improved efficiency.

  • Case-insensitive processing
  • Trained using MLM and SOP objectives
  • Implements "bells and whistles" finetuning approach with MNLI
  • Optimized for mobile deployment
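As a sketch of how the finetuned checkpoint can be used for inference, assuming the Hugging Face Hub id `squeezebert/squeezebert-mnli` and an installed `transformers`/`torch` stack:

```python
# Hedged sketch: scoring a premise/hypothesis pair with the MNLI-finetuned
# checkpoint. Label names are read from model.config.id2label rather than
# hard-coded, since the label order is a property of the checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "squeezebert/squeezebert-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

# MNLI inputs are sentence pairs; the tokenizer joins them with [SEP].
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 3): one score per MNLI class
label = model.config.id2label[logits.argmax(dim=-1).item()]
print(label)
```

Because the model is case-insensitive, the tokenizer lowercases the input before encoding, so mixed-case text needs no special handling.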

Core Capabilities

  • Natural Language Inference tasks
  • Efficient mobile deployment
  • Text classification
  • Sequence understanding

Frequently Asked Questions

Q: What makes this model unique?

SqueezeBERT-MNLI's primary distinction is its use of grouped convolutions instead of traditional fully-connected layers, resulting in significantly faster performance on mobile devices while maintaining competitive accuracy.

Q: What are the recommended use cases?

The model is particularly well-suited for mobile applications requiring natural language inference, text classification, and general language understanding tasks where computational efficiency is crucial.
