whisper-v3-tiny-random

Property	Value
Author	yujiepan
Base Model	Whisper-large-v3
Model Type	Speech Recognition
HuggingFace URL	Link

What is whisper-v3-tiny-random?

whisper-v3-tiny-random is a debugging model built on OpenAI's Whisper-large-v3 architecture. It keeps the same configuration structure but shrinks the layer count and hidden sizes to a minimum and uses random weights, so it loads and runs quickly in tests and during development.

Implementation Details

The model is a minimal version of the Whisper architecture. All parameters are randomly initialized from a uniform distribution between -0.5 and 0.5. Its specifications are:

2 encoder layers and 2 decoder layers
8-dimensional model size (d_model)
2 attention heads in both encoder and decoder
16-dimensional feed-forward networks
Random initialization with seed 42
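
The specifications above can be sketched with the Transformers library as follows. This is a minimal illustration, not the author's actual creation script: the vocab_size and num_mel_bins values are assumed to match Whisper-large-v3 and are not stated on this page, and the loop simply overwrites every parameter with uniform draws.

```python
import torch
from transformers import WhisperConfig, WhisperForConditionalGeneration, set_seed

# Tiny dimensions taken from the list above. vocab_size and num_mel_bins are
# assumed Whisper-large-v3 values; they are not given in this document.
config = WhisperConfig(
    vocab_size=51866,
    num_mel_bins=128,
    d_model=8,
    encoder_layers=2,
    decoder_layers=2,
    encoder_attention_heads=2,
    decoder_attention_heads=2,
    encoder_ffn_dim=16,
    decoder_ffn_dim=16,
)

set_seed(42)  # fix the RNG so the random weights are reproducible
model = WhisperForConditionalGeneration(config)

# Overwrite every parameter with draws from a uniform distribution on [-0.5, 0.5).
with torch.no_grad():
    for param in model.parameters():
        param.uniform_(-0.5, 0.5)

num_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {num_params:,}")
```

Most of the remaining parameters sit in the token embedding, because the vocabulary size is not reduced along with the hidden dimensions.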

Core Capabilities

Runs the automatic speech recognition task, including timestamp output (the transcripts themselves are meaningless because the weights are random)
Compatible with the Hugging Face Transformers pipeline API
Supports half-precision (float16) computation
Includes a feature extractor and a tokenizer
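
A hedged sketch of how these capabilities might be exercised through the Transformers pipeline API. The repository id is an assumption inferred from the author and model name on this page, and both helper functions are hypothetical names introduced for illustration. Half precision is only selected when a CUDA device is available.

```python
import numpy as np
import torch
from transformers import pipeline

# Assumed Hub repository id (author + model name from this page).
MODEL_ID = "yujiepan/whisper-v3-tiny-random"


def build_asr_pipeline(model_id: str = MODEL_ID):
    """Create an ASR pipeline, using float16 on GPU and float32 on CPU."""
    use_cuda = torch.cuda.is_available()
    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        torch_dtype=torch.float16 if use_cuda else torch.float32,
        device=0 if use_cuda else -1,
    )


def transcribe_silence(asr, seconds: float = 1.0, sampling_rate: int = 16_000):
    """Run the pipeline on silent audio and request timestamps."""
    audio = np.zeros(int(seconds * sampling_rate), dtype=np.float32)
    return asr({"raw": audio, "sampling_rate": sampling_rate}, return_timestamps=True)
```

Calling transcribe_silence(build_asr_pipeline()) downloads the model on first use and should return a dictionary with "text" and "chunks" entries. Because the weights are random, the text is noise; the call is only useful for checking that the pipeline runs end to end.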

Frequently Asked Questions

Q: What makes this model unique?

The model exists for debugging and development. It keeps the Whisper-v3 architecture and configuration layout but is drastically smaller and randomly initialized, so pipelines and integrations can be tested against it without loading the full-size model.

Q: What are the recommended use cases?

The model is intended for debugging, pipeline testing, and development workflows. Do not use it for production speech recognition: its parameters are randomly initialized, so its transcriptions carry no meaning.
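
As an illustration of the pipeline-testing use case, the following sketch runs an offline smoke test: it builds a model with the tiny dimensions listed on this page, feeds it dummy log-mel features, and checks only the output shape. The helper names are hypothetical, and num_mel_bins=128 is an assumed Whisper-large-v3 value; no weights are downloaded.

```python
import torch
from transformers import WhisperConfig, WhisperForConditionalGeneration


def tiny_whisper(num_mel_bins: int = 128) -> WhisperForConditionalGeneration:
    """Build a randomly initialized Whisper model with the tiny dimensions."""
    config = WhisperConfig(
        num_mel_bins=num_mel_bins,
        d_model=8,
        encoder_layers=2,
        decoder_layers=2,
        encoder_attention_heads=2,
        decoder_attention_heads=2,
        encoder_ffn_dim=16,
        decoder_ffn_dim=16,
    )
    return WhisperForConditionalGeneration(config).eval()


def smoke_test(model: WhisperForConditionalGeneration) -> torch.Size:
    """Run one forward pass on random features and return the logits shape."""
    cfg = model.config
    # The encoder halves the time axis, so it expects 2 * max_source_positions frames.
    frames = cfg.max_source_positions * 2
    features = torch.randn(1, cfg.num_mel_bins, frames)
    decoder_input_ids = torch.tensor([[cfg.decoder_start_token_id]])
    with torch.no_grad():
        logits = model(input_features=features, decoder_input_ids=decoder_input_ids).logits
    return logits.shape


model = tiny_whisper()
shape = smoke_test(model)
print(shape)  # (batch, decoder tokens, vocab_size)
```

A check like this verifies tensor shapes and wiring only; it says nothing about transcription quality.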