Reka Flash 3

Maintained By: RekaAI

Parameter Count: 21B
Model Type: General-purpose reasoning
Architecture: Llama-compatible
Hugging Face: RekaAI/reka-flash-3
Tokenizer: cl100k_base

What is reka-flash-3?

Reka Flash 3 is a 21B parameter language model designed for general-purpose reasoning. Trained from scratch on synthetic and public datasets, it was then refined with supervised fine-tuning and RLOO (REINFORCE Leave-One-Out) using model-based and rule-based rewards. It performs competitively with proprietary models such as OpenAI o1-mini, making it well suited to applications that require low latency or on-device deployment.

Implementation Details

The model ships in a Llama-compatible format, so it can be deployed with any Llama-compatible library. It uses the cl100k_base tokenizer without additional special tokens and relies on a specific prompt format for multi-round conversations.

  • Supports multi-round conversations with a clear prompt structure
  • Implements a budget forcing mechanism to cap the model's thinking phase (see the sketch after this list)
  • Compatible with popular frameworks such as Hugging Face Transformers and vLLM
  • Primarily focused on English, with limited support for other languages
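
A minimal loading-and-generation sketch with Hugging Face Transformers is shown below. The "human: ... <sep> assistant:" prompt layout and the <reasoning> budget-forcing convention follow the description above, but treat the exact strings and generation settings as assumptions to verify against the RekaAI/reka-flash-3 model card rather than an official recipe.

```python
# Sketch: loading Reka Flash 3 with Hugging Face Transformers.
# The prompt layout ("human: ... <sep> assistant:") and the <reasoning> budget-forcing
# convention are assumptions based on the description above; check the
# RekaAI/reka-flash-3 model card before relying on them.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RekaAI/reka-flash-3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def build_prompt(turns):
    """Format a multi-round conversation as alternating human/assistant turns."""
    parts = [f"{role}: {text}" for role, text in turns]
    return " <sep> ".join(parts) + " <sep> assistant:"

prompt = build_prompt([("human", "What is 17 * 23? Think step by step.")])
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The model emits a reasoning trace wrapped in <reasoning>...</reasoning> before its
# final answer. Capping max_new_tokens (or stopping on </reasoning> once a thinking
# budget is spent and appending it to the context) is one way to enforce the budget.
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```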

Core Capabilities

  • General-purpose reasoning tasks
  • Low-latency inference (a vLLM serving sketch follows this list)
  • On-device deployment support
  • Web browsing and code execution through the Nexus platform
  • Document, image, video, and audio analysis
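
To make the low-latency path concrete, a minimal vLLM sketch is shown below; the single-turn prompt string and sampling parameters are illustrative assumptions, not Reka-recommended defaults.

```python
# Minimal vLLM serving sketch for Reka Flash 3 (parameters are illustrative).
from vllm import LLM, SamplingParams

llm = LLM(model="RekaAI/reka-flash-3")  # loads the Llama-compatible checkpoint

prompts = ["human: Summarize what budget forcing does in one sentence. <sep> assistant:"]
params = SamplingParams(temperature=0.6, max_tokens=512)

for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```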

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for combining its 21B parameter size with strong efficiency, which Reka positions as best-in-class among open models of comparable size. Its ability to compete with proprietary models while remaining flexible to deploy is particularly noteworthy.

Q: What are the recommended use cases?

The model is best suited to general reasoning tasks and applications that require low latency. For knowledge-intensive tasks, it is recommended to pair the model with web search; a minimal retrieval sketch follows below. It is particularly effective when deployed through the Nexus platform for organizational AI workers with deep research capabilities.
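
As a rough illustration of coupling the model with web search for knowledge-intensive questions, the sketch below prepends retrieved snippets to the prompt before generation. The search_web helper is a hypothetical placeholder, not a Reka or PromptLayer API; swap in whichever search backend you actually use.

```python
# Hypothetical retrieval-augmented prompting: search results become prompt context.
# search_web is a placeholder stub, not a real API; replace it with your search client.
def search_web(query: str, k: int = 3) -> list[str]:
    # Canned snippets so the sketch runs end to end without network access.
    return [f"[snippet {i + 1} about: {query}]" for i in range(k)]

def build_grounded_prompt(question: str) -> str:
    snippets = "\n".join(search_web(question))
    context = (
        "Use the following search results when answering.\n"
        f"{snippets}\n\n{question}"
    )
    return f"human: {context} <sep> assistant:"

print(build_grounded_prompt("What models did Reka release alongside Flash 3?"))
```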
