phi-2

Maintained By
SkunkworksAI

Microsoft Phi-2

Property: Value
Parameter Count: 2.78B
Model Type: Transformer-based Language Model
Training Data: 250B tokens
License: Microsoft Research License
Training Infrastructure: 96xA100-80G GPUs, 14 days training

What is phi-2?

Phi-2 is a language model from Microsoft, released for research purposes. With roughly 2.7 billion parameters (2.78B by exact count), it shows that an efficient, smaller-scale model can compete with larger counterparts on tasks requiring common sense, language understanding, and logical reasoning.

Implementation Details

Phi-2 was trained with PyTorch and DeepSpeed, using flash-attention >2.0.0. Training covered 1.4T tokens drawn from a 250B-token dataset, so the data was seen in multiple passes. The dataset combines synthetic NLP data created by GPT-3.5 with filtered web data from Falcon RefinedWeb and SlimPajama. GPT-4 was used to assess the filtered web data for quality and safety.

  • Architecture: Transformer-based model trained with a next-word prediction objective
  • Training Infrastructure: 96 A100-80G GPUs
  • Training Duration: 14 days
  • Framework: PyTorch with DeepSpeed optimization
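
The training figures above imply a few derived quantities that are easy to check by hand. The following is a minimal sketch in Python, assuming that the 14 days are continuous wall-clock time, that all 96 GPUs were busy throughout, and that the 250B-token figure is the dataset size while 1.4T is the number of tokens seen during training. The variable names are illustrative and do not come from any Phi-2 tooling.

```python
# Figures quoted on this page.
TRAINING_TOKENS = 1.4e12   # tokens seen during training
DATASET_TOKENS = 250e9     # dataset size (assumption: distinct from tokens seen)
NUM_GPUS = 96              # A100-80G GPUs
TRAINING_DAYS = 14         # assumption: continuous wall-clock time

# Total GPU time consumed, in GPU-hours.
gpu_hours = NUM_GPUS * TRAINING_DAYS * 24

# Approximate number of passes over the dataset.
passes_over_dataset = TRAINING_TOKENS / DATASET_TOKENS

# Average throughput, in aggregate and per GPU.
wall_clock_seconds = TRAINING_DAYS * 24 * 3600
tokens_per_second = TRAINING_TOKENS / wall_clock_seconds
tokens_per_gpu_second = tokens_per_second / NUM_GPUS

print(f"GPU-hours:             {gpu_hours:,}")
print(f"Passes over dataset:   {passes_over_dataset:.1f}")
print(f"Tokens/s (aggregate):  {tokens_per_second:,.0f}")
print(f"Tokens/s per GPU:      {tokens_per_gpu_second:,.0f}")
```

Under these assumptions the run comes to about 32 thousand GPU-hours and between five and six passes over the data. The throughput numbers are averages only and say nothing about restarts, evaluation time, or idle periods.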

Core Capabilities

  • Question-Answering with high accuracy
  • Natural chat interactions
  • Python code generation
  • Common sense reasoning
  • Language understanding tasks
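
One way to try these capabilities locally is through the Hugging Face `transformers` library. The sketch below assumes that `torch` and `transformers` are installed and that the checkpoint is available under the id `microsoft/phi-2`. That id is an assumption, since this page names SkunkworksAI as maintainer and does not state a repository id. The helper names `weight_memory_gib` and `generate` are hypothetical.

```python
MODEL_ID = "microsoft/phi-2"  # assumption: substitute the checkpoint you actually use


def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Rough memory needed for the weights alone, in GiB (no activations or cache)."""
    return n_params * bytes_per_param / 2**30


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and return the prompt plus its greedy continuation."""
    # Heavy imports are deferred so the helpers above work without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Use half precision only on a GPU; fall back to float32 on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=dtype)
    model.to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # 2.78B parameters at 2 bytes each (float16).
    print(f"fp16 weights: ~{weight_memory_gib(2.78e9, 2):.1f} GiB")
    print(generate("Instruct: Explain what a hash table is.\nOutput:"))
```

The memory estimate covers the weights only, so actual usage during generation will be higher. Calling `generate` downloads the checkpoint on first use.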

Frequently Asked Questions

Q: What makes this model unique?

Phi-2 achieves near state-of-the-art performance among models with fewer than 13 billion parameters on benchmarks of common sense, language understanding, and logical reasoning, and it does so without fine-tuning through reinforcement learning from human feedback. It is designed for research purposes, and its web training data was filtered for safety and educational value.

Q: What are the recommended use cases?

The model is best suited to research applications that prompt it in QA format, chat format, or code format, with code generation focused on Python. It is not intended for production use. It is meant primarily for research, especially into safety challenges and model controllability.
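
The three prompt formats can be sketched as plain string templates. The layouts below ("Instruct: ... Output:" for QA, alternating speaker lines for chat, and a function signature with a docstring for code) follow the upstream Phi-2 model card as best understood here and should be treated as assumptions to verify against it. The helper names are hypothetical.

```python
def qa_prompt(question: str) -> str:
    """QA format: an instruction followed by an open 'Output:' slot."""
    return f"Instruct: {question}\nOutput:"


def chat_prompt(turns: list[tuple[str, str]], next_speaker: str) -> str:
    """Chat format: one 'Name: text' line per turn, ending on the next speaker."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)


def code_prompt(signature: str, docstring: str) -> str:
    """Code format: a Python signature and docstring for the model to complete."""
    quotes = '"""'
    return f"{signature}\n    {quotes}\n    {docstring}\n    {quotes}\n"


if __name__ == "__main__":
    print(qa_prompt("Why is the sky blue?"))
    print(chat_prompt([("Alice", "I can't focus while studying. Any tips?")], "Bob"))
    print(code_prompt("def print_prime(n):", "Print all primes between 1 and n"))
```

Each helper returns a string that can be passed to the model as the prompt. The model then continues the text after the trailing `Output:`, speaker label, or docstring.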
