CausalLM-14B-GGUF

Maintained By: TheBloke


  • Parameter Count: 14.2B
  • License: WTFPL
  • Format: GGUF (Various Quantizations)
  • Languages: English, Chinese

What is CausalLM-14B-GGUF?

CausalLM-14B-GGUF is a large language model quantized into the GGUF format for efficient deployment. Built on the LLaMA2 architecture, it performs strongly across benchmarks, achieving 67.36% accuracy on MMLU and 73.10% on CEval and outperforming many larger models.

Implementation Details

The model uses the ChatML prompt format (a prompting sketch follows the list below) and ships in multiple GGUF quantizations ranging from 2-bit to 8-bit precision. It was trained on a curated dataset of 1.3B tokens that combines synthetic data with carefully selected entries from sources including Wikipedia, Fandom, and Moegirlpedia.

  • Multiple quantization options (Q2_K through Q8_0) for different size/performance trade-offs
  • Supports a context length of 4096 tokens
  • Uses the efficient attention implementation of the original LLaMA2 architecture
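
Because the model expects ChatML, the sketch below shows one way a prompt could be assembled and run locally with llama-cpp-python. The model file name, thread settings, and generation parameters are assumptions for illustration, not values taken from the repository.

```python
# Minimal sketch: prompting a CausalLM-14B GGUF file via llama-cpp-python
# using the ChatML format described above.
from llama_cpp import Llama

llm = Llama(
    model_path="causallm_14b.Q4_0.gguf",  # assumed local file name; use whichever quant you downloaded
    n_ctx=4096,                            # context length stated above
)

# ChatML wraps each turn in <|im_start|>role ... <|im_end|> markers.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nExplain GGUF quantization in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

out = llm(prompt, max_tokens=128, stop=["<|im_end|>"])
print(out["choices"][0]["text"])
```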

Core Capabilities

  • Strong performance on mathematical reasoning (70.12% on GSM8K)
  • Exceptional multilingual abilities (English and Chinese)
  • 88.26% win rate on AlpacaEval Leaderboard
  • Optimized for both CPU and GPU inference (see the sketch after this list)
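
As referenced in the last bullet, the following sketch shows how CPU-only and GPU-offloaded inference could be configured with llama-cpp-python. The file name, thread count, and layer counts are illustrative assumptions to tune for your hardware.

```python
# Minimal sketch of CPU vs. GPU inference with llama-cpp-python.
from llama_cpp import Llama

# CPU-only inference: keep all layers in RAM and rely on multi-threading.
llm = Llama(
    model_path="causallm_14b.Q4_0.gguf",  # assumed local file name
    n_threads=8,                           # CPU threads to use
    n_gpu_layers=0,                        # 0 = no GPU offload
)

# GPU offload (requires a CUDA/Metal/ROCm build of llama-cpp-python):
# set n_gpu_layers=-1 to push every layer to VRAM, or a smaller number
# to split the model between GPU and CPU.
# llm = Llama(model_path="causallm_14b.Q4_0.gguf", n_gpu_layers=-1)
```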

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its exceptional performance-to-size ratio, outperforming all models under 70B parameters in most quantitative evaluations. It is also optimized for both CPU and GPU deployment through the GGUF format.
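
Since deployment starts with fetching a single quantized file, here is a minimal sketch using huggingface_hub. The filename below is an assumption; check the repository's file listing for the exact quantization you want.

```python
# Minimal sketch: download one quantized GGUF file from the Hugging Face repo.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="TheBloke/CausalLM-14B-GGUF",
    filename="causallm_14b.Q4_0.gguf",  # assumed file name; verify in the repo
)
print(local_path)
```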

Q: What are the recommended use cases?

The model excels in both academic and general-purpose tasks, making it suitable for mathematical reasoning, multilingual applications, and general text generation. It's particularly effective for deployments requiring a balance of performance and efficiency.
