TinyOctopus

SaraAlthubaiti

TinyOctopus is a bilingual Audio-LLM that combines Distil-Whisper and DeepSeek 1.5B for Arabic/English speech processing, reaching 70.59% accuracy on Arabic dialect identification.

Property       Value
Author         SaraAlthubaiti
Model Type     Bilingual Audio Language Model
Architecture   Distil-Whisper + DeepSeek 1.5B
Model URL      Hugging Face

What is TinyOctopus?

TinyOctopus is a bilingual Audio Language Model that takes Arabic or English audio as input and generates text. It uses Distil-Whisper for audio encoding and DeepSeek 1.5B for text generation, with a cross-attention projection layer connecting the two.
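To make the role of the projection layer concrete, here is a minimal NumPy sketch of one common way to bridge an audio encoder and a language model with cross-attention: a small set of learnable query vectors attends over the encoder's frame features and produces a fixed-length prefix in the language model's embedding space. This is not the TinyOctopus implementation. The class name CrossAttentionProjector, the single-head design, the use of learnable queries, and all dimensions (1280 audio features, 1536 language-model features, 1500 frames, 64 queries) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


class CrossAttentionProjector:
    """Hypothetical single-head cross-attention bridge (illustration only).

    Learnable queries attend over audio encoder frames and return a
    fixed-length sequence in the language model's embedding dimension.
    """

    def __init__(self, audio_dim, llm_dim, num_queries, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.02  # small random init; a real layer would be trained
        self.queries = rng.normal(0.0, scale, (num_queries, llm_dim))
        self.w_q = rng.normal(0.0, scale, (llm_dim, llm_dim))
        self.w_k = rng.normal(0.0, scale, (audio_dim, llm_dim))
        self.w_v = rng.normal(0.0, scale, (audio_dim, llm_dim))
        self.w_o = rng.normal(0.0, scale, (llm_dim, llm_dim))

    def __call__(self, audio_features):
        # audio_features: (num_frames, audio_dim) from the audio encoder
        q = self.queries @ self.w_q          # (num_queries, llm_dim)
        k = audio_features @ self.w_k        # (num_frames, llm_dim)
        v = audio_features @ self.w_v        # (num_frames, llm_dim)
        scores = q @ k.T / np.sqrt(q.shape[-1])  # (num_queries, num_frames)
        weights = softmax(scores, axis=-1)   # each query's weights sum to 1
        return (weights @ v) @ self.w_o      # (num_queries, llm_dim)


# Stand-in for encoder output: 1500 frames of 1280-dim features (assumed sizes).
rng = np.random.default_rng(1)
audio_features = rng.normal(size=(1500, 1280))

projector = CrossAttentionProjector(audio_dim=1280, llm_dim=1536, num_queries=64)
prefix = projector(audio_features)
print(prefix.shape)
```

In a setup like this, the resulting prefix would be concatenated with the embedded text prompt before being passed to the language model, so the audio length no longer dictates how many positions the language model has to process.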

Implementation Details

The architecture has three main components: Distil-Whisper (distil-large-v3) for audio encoding, a trainable cross-attention projection layer for feature alignment, and DeepSeek 1.5B as the core language model. Training data includes QASR (2,000 hours of Arabic speech) and ADI17 (3,000 hours of dialect-specific content). Reported results:

  • Arabic ASR Performance: 16.00% WER
  • English ASR Performance: 4.50% WER
  • Translation BLEU Score: 55.05 (GPT-4o)
  • Dialect Identification Accuracy: 70.59%
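The ASR figures above are word error rates (WER): the number of word-level substitutions, deletions, and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The sketch below computes WER for a single utterance with a standard edit-distance recurrence. The function name word_error_rate is hypothetical, and the whitespace tokenization is an assumption; the text normalization used to obtain the reported scores (for example, handling of Arabic diacritics or punctuation) is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.2%}")  # prints 33.33%
```

Corpus-level WER is usually computed by summing errors and reference lengths over all utterances before dividing, rather than by averaging per-utterance rates.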

Core Capabilities

  • Bilingual Automatic Speech Recognition (ASR)
  • Arabic to English Speech Translation
  • Arabic Dialect Identification
  • Multi-dialect Speech Processing

Frequently Asked Questions

Q: What makes this model unique?

TinyOctopus stands out for its bilingual coverage and its specialized handling of Arabic dialects. It reports 16.00% WER on Arabic ASR, 4.50% WER on English ASR, and a translation BLEU score of 55.05, while staying compact by pairing a distilled Whisper encoder with a 1.5B language model.

Q: What are the recommended use cases?

The model is ideal for automatic transcription of Arabic and English speech, Arabic-to-English translation, and Arabic dialect identification in various contexts such as broadcast media, academic research, and general speech processing applications.
