Crystal

Maintained By
LLM360


Property           Value
Parameter Count    7 Billion
License            Apache 2.0
Paper              Research Paper
Training Data      SlimPajama and StarCoder
Architecture       LLaMA-based with muP modifications

What is Crystal?

Crystal is an advanced 7B parameter language model that represents a significant achievement in balanced language model development. Trained on 1.4 trillion tokens from SlimPajama and StarCoder datasets, it demonstrates exceptional capabilities in both natural language processing and coding tasks. Despite using fewer training tokens than LLaMA 2, Crystal achieves superior performance on several benchmarks including MMLU, HumanEval, and MBPP.

Implementation Details

Crystal uses a LLaMA-style decoder-only architecture augmented with maximal update parameterization (muP) for more stable training and hyperparameter transfer. Training is split into three stages, each processing a different mix of SlimPajama and StarCoder data. The architecture also includes specialized embedding scaling, refined attention mechanisms, and custom learning-rate schedules.
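One visible muP modification is how attention logits are scaled: instead of the standard 1/sqrt(d_head), muP divides by d_head so logit magnitudes stay O(1) as width grows. The sketch below illustrates just that one piece under stated assumptions; the full muP recipe (which Crystal's paper details) also rescales initializations, embedding outputs, and per-layer learning rates.

```python
import math
import torch

def attention_scores_mup(q: torch.Tensor, k: torch.Tensor,
                         use_mup: bool = True) -> torch.Tensor:
    """Dot-product attention logits.

    Standard parameterization divides by sqrt(d_head); muP divides by
    d_head so logits stay O(1) as width grows. This is a simplified
    sketch of one muP change, not Crystal's full implementation.
    """
    d_head = q.shape[-1]
    scale = d_head if use_mup else math.sqrt(d_head)
    return q @ k.transpose(-2, -1) / scale

# (batch, heads, seq, d_head) -- shapes are illustrative.
q = torch.randn(1, 4, 8, 16)
k = torch.randn(1, 4, 8, 16)
scores = attention_scores_mup(q, k)
```

Because the muP scale differs from the standard one only by a constant factor of sqrt(d_head), the two parameterizations produce proportional logits at a fixed width; the benefit shows up when transferring hyperparameters across widths.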

  • Custom tokenizer with 32,032 vocabulary size
  • Training sequence length of 2048
  • LayerNorm instead of RMSNorm
  • Rotary position embeddings on 25% of hidden dimensions
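The last bullet, partial rotary embeddings, can be sketched as applying RoPE to only the first 25% of each head's dimensions and passing the rest through unchanged (a GPT-NeoX-style trick). The 25% fraction comes from the list above; the pairing convention and base frequency here are illustrative assumptions, not Crystal's exact code.

```python
import torch

def apply_partial_rope(x: torch.Tensor, rotary_frac: float = 0.25,
                       base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embeddings to a fraction of head dims.

    x: (..., seq_len, d_head). Only the first rotary_frac of the head
    dimensions are rotated; the remainder is left untouched.
    """
    seq_len, d_head = x.shape[-2], x.shape[-1]
    rot_dim = int(d_head * rotary_frac)
    x_rot, x_pass = x[..., :rot_dim], x[..., rot_dim:]

    # Standard RoPE frequency schedule over the rotated dims only.
    inv_freq = 1.0 / (base ** (torch.arange(0, rot_dim, 2).float() / rot_dim))
    pos = torch.arange(seq_len).float()
    freqs = torch.outer(pos, inv_freq)               # (seq_len, rot_dim/2)
    cos = freqs.cos().repeat_interleave(2, dim=-1)   # (seq_len, rot_dim)
    sin = freqs.sin().repeat_interleave(2, dim=-1)

    # Rotate each adjacent pair (x0, x1) -> (-x1, x0).
    x1 = x_rot[..., 0::2]
    x2 = x_rot[..., 1::2]
    rotated = torch.stack((-x2, x1), dim=-1).flatten(-2)

    return torch.cat([x_rot * cos + rotated * sin, x_pass], dim=-1)

x = torch.randn(2, 8, 16)          # (batch, seq, d_head)
out = apply_partial_rope(x)        # rotates dims 0..3, passes 4..15 through
```

Rotating only a quarter of the dimensions keeps positional information while leaving most of each head position-agnostic, which cuts the cost of the rotation and is a common choice in GPT-NeoX-derived models.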

Core Capabilities

  • Superior performance in coding tasks (HumanEval, MBPP)
  • Strong natural language understanding (MMLU, ARC)
  • Balanced performance across language and coding tasks
  • Support for fill-in-the-middle (FIM) inference
  • Specialized handling of code metadata and instruction tuning
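FIM inference works by reordering a document so the model sees the prefix and suffix first and generates the missing middle. A minimal prompt-assembly sketch in the StarCoder-style PSM (prefix-suffix-middle) ordering is shown below; the sentinel token strings are hypothetical placeholders, so check Crystal's tokenizer for the actual special tokens it was trained with.

```python
def build_fim_prompt(prefix: str, suffix: str,
                     pre_tok: str = "<fim_prefix>",
                     suf_tok: str = "<fim_suffix>",
                     mid_tok: str = "<fim_middle>") -> str:
    """Assemble a PSM-ordered FIM prompt.

    The model is expected to generate the middle span after mid_tok.
    Sentinel token names here are illustrative, not Crystal's exact
    vocabulary entries.
    """
    return f"{pre_tok}{prefix}{suf_tok}{suffix}{mid_tok}"

prompt = build_fim_prompt(
    "def add(a, b):\n    return ",
    "\n\nprint(add(1, 2))",
)
```

At inference time the text generated after the final sentinel is spliced back between the prefix and suffix, which is what enables editor-style code completion in the middle of a file.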

Frequently Asked Questions

Q: What makes this model unique?

Crystal stands out for its balanced performance on both coding and language tasks, achieved through its muP implementation and three-stage training process. It outperforms comparable models on several benchmarks despite using fewer training tokens.

Q: What are the recommended use cases?

The model excels in both programming tasks and natural language processing, making it ideal for code generation, technical documentation, and general language understanding tasks. It's particularly well-suited for applications requiring both coding and natural language capabilities.
