
CausalLM 14B

  • Maintained By: CausalLM
  • License: WTFPL
  • Languages: English, Chinese, Japanese
  • Framework: PyTorch
  • Training Data: 1.3B tokens across 20 datasets

What is CausalLM 14B?

CausalLM 14B is a 14-billion-parameter language model built on the LLaMA 2 architecture. It posts strong results across multiple benchmarks, performs well on both English and Chinese tasks, and shows notable cross-lingual transfer despite its focused training data.

Implementation Details

The model uses the standard LLaMA 2 architecture with no additional RoPE scaling and was trained on a curated corpus of 1.3B tokens drawn from 20 datasets. It is compatible with common quantization methods, though the developers recommend the unquantized base model where resources allow.
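
Because the model keeps the stock LLaMA 2 architecture, it loads through the standard Hugging Face transformers classes. A minimal loading sketch, assuming the published CausalLM/14B repository; the dtype and device choices here are illustrative, not an official recommendation:

```python
# Minimal loading sketch via Hugging Face transformers.
# dtype/device choices are illustrative assumptions, not an official spec.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CausalLM/14B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # unquantized base weights, per the note above
    device_map="auto",           # requires the `accelerate` package
)
```

Reported benchmark results include: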

  • Achieves 67.36% average accuracy on MMLU
  • Scores 73.10% on CEval, outperforming GPT-4
  • Zero-shot accuracy of 70.13% on GSM8K
  • DPO version ranks #1 among ~13B models

Core Capabilities

  • Multi-lingual understanding and generation
  • Strong performance in STEM and humanities tasks
  • Efficient speculative sampling (see the sketch after this list)
  • Compatible with visual instruction fine-tuning
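
The speculative-sampling item above can be exercised through transformers' assisted generation, where a smaller model drafts tokens and the 14B model verifies them. A sketch under the assumption that the companion CausalLM/7B checkpoint serves as the draft model:

```python
# Speculative decoding sketch using transformers' assisted generation.
# Pairing with CausalLM/7B as the draft model is an assumption based on
# the speculative-sampling capability noted above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CausalLM/14B")
target = AutoModelForCausalLM.from_pretrained(
    "CausalLM/14B", torch_dtype=torch.bfloat16, device_map="auto"
)
draft = AutoModelForCausalLM.from_pretrained(
    "CausalLM/7B", torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Briefly explain speculative sampling.",
                   return_tensors="pt").to(target.device)
# The draft model proposes several tokens; the 14B model accepts or rejects
# them in a single forward pass, preserving the target distribution.
output = target.generate(**inputs, assistant_model=draft, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```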

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its performance-to-size ratio: it outperforms all models under 70B parameters in most quantitative evaluations. It also shows strong cross-lingual ability despite training focused primarily on English and Chinese.

Q: What are the recommended use cases?

The model excels in general text generation, academic question-answering, and multilingual tasks. It's particularly suitable for applications requiring strong reasoning capabilities in both English and Chinese.
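
For chat-style use, the upstream model card specifies a ChatML prompt format. A brief generation sketch reusing the model and tokenizer loaded earlier; the system message and question are illustrative:

```python
# ChatML-formatted prompt, per the upstream card's stated prompt format.
# Reuses `model` and `tokenizer` from the loading sketch above.
prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "用中文简要介绍一下大语言模型。<|im_end|>\n"  # "Briefly introduce LLMs, in Chinese."
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens and decode only the model's reply.
reply = tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                         skip_special_tokens=True)
print(reply)
```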
