UltraFastBERT-1x11-long
| Property | Value |
|---|---|
| Parameter Count | 189M |
| License | MIT |
| Paper | arXiv:2311.10770 |
| Tensor Type | F32 |
| Training Data | EleutherAI/pile |
What is UltraFastBERT-1x11-long?
UltraFastBERT-1x11-long is a BERT variant that uses only 0.3% of its neurons during inference while delivering performance comparable to traditional BERT models. For each layer inference, it selectively engages just 12 of 4095 neurons by replacing the standard feedforward layers with fast feedforward networks (FFFs).
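The core idea behind FFFs is to arrange the feedforward neurons as a balanced binary tree and, at inference time, evaluate only the neurons on a single root-to-leaf path. The sketch below illustrates this conditional execution; it is a simplified illustration rather than the authors' implementation, and the layer width, activation choice, and routing details are assumptions.

```python
# Simplified sketch of a fast feedforward (FFF) layer with hard, tree-based
# routing at inference time. Shapes, activation, and initialization here are
# illustrative assumptions, not the reference implementation.
import torch
import torch.nn as nn

class FFFSketch(nn.Module):
    def __init__(self, width: int, path_len: int):
        super().__init__()
        self.path_len = path_len              # neurons touched per token (12 here)
        self.n_nodes = 2 ** path_len - 1      # total neurons in the tree (4095 here)
        self.w_in = nn.Parameter(0.02 * torch.randn(self.n_nodes, width))
        self.w_out = nn.Parameter(0.02 * torch.randn(self.n_nodes, width))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, width); each row visits only `path_len` of the n_nodes neurons
        node = torch.zeros(x.shape[0], dtype=torch.long, device=x.device)  # root
        y = torch.zeros_like(x)
        for _ in range(self.path_len):
            logit = (x * self.w_in[node]).sum(dim=-1)         # one neuron per token
            y = y + torch.relu(logit).unsqueeze(-1) * self.w_out[node]
            node = 2 * node + 1 + (logit > 0).long()          # descend left or right
        return y

# With path_len=12 the tree holds 2**12 - 1 = 4095 neurons, matching the
# "12 out of 4095" figure quoted above.
layer = FFFSketch(width=768, path_len=12)
out = layer(torch.randn(4, 768))
```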
Implementation Details
Replacing dense feedforward layers with fast feedforward networks (FFFs) yields a 78x CPU speedup over an optimized baseline feedforward implementation, and a 40x speedup over equivalent batched feedforward inference in a PyTorch implementation.
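As a back-of-envelope check on why such speedups are plausible: the 78x and 40x figures are measured results, while the ratio below is only the theoretical feedforward FLOP reduction implied by the neuron counts quoted above.

```python
# Fraction of feedforward neurons touched per token, and the implied theoretical
# reduction in feedforward FLOPs (ignoring attention, routing overhead, and
# memory effects, which is why measured speedups are lower).
neurons_total = 4095   # neurons per FFF layer
neurons_used = 12      # neurons evaluated on one root-to-leaf path

fraction = neurons_used / neurons_total              # ~0.003, i.e. ~0.3%
theoretical_speedup = neurons_total / neurons_used   # ~341x for the FF part alone
print(f"{fraction:.2%} of neurons used, up to {theoretical_speedup:.0f}x fewer FF FLOPs")
```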
- Pretrained on the EleutherAI/pile dataset
- Implements a selective neuron engagement mechanism
- Achieves an 83.0% average score on GLUE benchmark tasks
- Released under the MIT license for open development
Core Capabilities
- Masked Language Modeling
- Efficient inference with minimal computational resources
- Strong performance on GLUE tasks (MNLI, QQP, QNLI, SST-2, STS-B, MRPC, RTE)
- Compatible with standard transformer libraries
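A minimal masked-language-modeling example with the transformers library is sketched below. The Hub identifier and the use of trust_remote_code are assumptions; the official UltraFastBERT repository documents the supported loading path and any custom code required.

```python
# Hypothetical masked-LM usage sketch with Hugging Face transformers.
# The model identifier and trust_remote_code flag below are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "pbelcak/UltraFastBERT-1x11-long"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)
model.eval()

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Read out the top predictions at the masked position
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_tokens = logits[0, mask_pos].topk(5, dim=-1).indices[0]
print(tokenizer.convert_ids_to_tokens(top_tokens.tolist()))
```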
Frequently Asked Questions
Q: What makes this model unique?
The model's ability to use only 0.3% of its neurons during inference while maintaining BERT-level performance. It achieves this through fast feedforward networks, which translate into substantial inference speedups.
Q: What are the recommended use cases?
The model is primarily intended for research and for fine-tuning on downstream tasks such as GLUE. Note that this is a raw pretraining checkpoint: it is untested and not fit for deployment in production environments.
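For fine-tuning on a GLUE task, a conventional transformers Trainer setup along the following lines could be adapted. This is a sketch under the assumption that a sequence-classification head can be attached to the checkpoint via trust_remote_code; the fine-tuning scripts in the official repository are the authoritative route.

```python
# Hypothetical GLUE (SST-2) fine-tuning sketch using transformers and datasets.
# The model identifier and classification-head loading are assumptions.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

model_id = "pbelcak/UltraFastBERT-1x11-long"   # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, num_labels=2, trust_remote_code=True)

dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="ultrafastbert-sst2",
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    num_train_epochs=3,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
)
trainer.train()
print(trainer.evaluate())
```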