Falcon3-3B-Instruct

Maintained By
tiiuae

Falcon3-3B-Instruct

PropertyValue
Parameter Count3 Billion
Context Length32K tokens
LanguagesEnglish, French, Spanish, Portuguese
LicenseTII Falcon-LLM License 2.0
Release DateDecember 2024

What is Falcon3-3B-Instruct?

Falcon3-3B-Instruct is a powerful instruction-tuned language model developed by the Technology Innovation Institute. It's a pruned and optimized version of Falcon3-7B-Base, specifically designed for high-performance reasoning, STEM tasks, and multilingual capabilities. The model was trained on 100 Gigatokens of diverse datasets and further refined with 1.2 million samples of specialized content.

Implementation Details

The model implements a transformer-based causal decoder-only architecture with 22 decoder blocks. It features Grouped Query Attention (GQA) with 12 query heads and 4 key-value heads, enabling faster inference. The architecture incorporates SwiGLU activation and RMSNorm, with a high RoPE value of 1000042 for enhanced long-context understanding.

  • Wide head dimension of 256
  • 131K vocabulary size
  • 32K context length support
  • Optimized using 1024 H100 GPU chips

Core Capabilities

  • Strong performance in STEM and mathematical reasoning (78% on GSM8K with Chain-of-Thought)
  • Exceptional results in scientific understanding (95.5% on SciQ)
  • Robust multilingual support across four languages
  • Advanced reasoning capabilities (45.4% on BBH benchmark)
  • Effective instruction following with 7.2 MT-Bench average score

Frequently Asked Questions

Q: What makes this model unique?

Falcon3-3B-Instruct stands out for its efficient architecture that achieves strong performance despite its relatively compact size. It particularly excels in STEM and scientific tasks, outperforming larger models in specific benchmarks like SciQ and MATH Level-5.

Q: What are the recommended use cases?

The model is particularly well-suited for scientific and mathematical applications, multilingual content generation, and tasks requiring complex reasoning. It's ideal for applications needing strong performance in STEM fields while maintaining moderate computational requirements.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.