GALACTICA 120B
| Property | Value |
|---|---|
| Parameters | 120 Billion |
| License | CC BY-NC 4.0 |
| Release Date | November 2022 |
| Training Data | 106B tokens of scientific text |
| Paper | Research Paper |
What is galactica-120b?
GALACTICA 120B is Meta AI's largest scientific language model, designed specifically for scientific tasks and research. Trained on a large corpus of scientific papers, textbooks, and reference material, it is intended to help store, combine, and reason about scientific knowledge. The model is the largest member of the GALACTICA family, whose variants range from 125M to 120B parameters.
Implementation Details
The model utilizes a transformer-based architecture in a decoder-only setup, implemented using PyTorch. It can be deployed using various precision options including FP16 and INT8 for efficient inference, and supports both CPU and GPU execution through the Hugging Face transformers library.
- Supports multiple scientific tasks including citation prediction and mathematical reasoning
- Implements specialized tokenization for different scientific modalities
- Offers flexible deployment options with various hardware configurations
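The deployment options above can be exercised through the standard transformers API. The following is a minimal loading sketch, assuming the Hugging Face hub ID `facebook/galactica-120b`, the `accelerate` package for weight sharding, and enough GPU memory for the FP16 weights (roughly 240 GB for 120B parameters); INT8 quantization would roughly halve that again.

```python
# Minimal loading sketch (assumes facebook/galactica-120b on the Hugging Face hub
# and a multi-GPU host with sufficient memory).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-120b")

# FP16 halves memory relative to FP32; device_map="auto" shards the weights
# across available GPUs (requires the `accelerate` package).
model = AutoModelForCausalLM.from_pretrained(
    "facebook/galactica-120b",
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("The attention mechanism in Transformers", return_tensors="pt").to(model.device)
outputs = model.generate(inputs.input_ids, max_new_tokens=60)
print(tokenizer.decode(outputs[0]))
```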
Core Capabilities
- Scientific Question Answering
- Mathematical Reasoning
- Document Summarization
- Molecular Property Prediction
- Entity Extraction
- Citation Prediction
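Several of these capabilities are exposed through prompting conventions rather than separate APIs: the paper describes a `[START_REF]` marker for citation prediction and a `<work>` token that asks the model to show step-by-step working. The snippet below is an illustrative sketch of these prompt styles, reusing the hypothetical `model` and `tokenizer` from the loading example above; the prompts themselves are placeholders, not examples from the paper.

```python
# Illustrative prompt-style sketch (assumes `model` and `tokenizer`
# from the loading example above).

# Citation prediction: the model continues after [START_REF] with a
# predicted reference for the preceding statement.
citation_prompt = "The Transformer architecture [START_REF]"

# Mathematical reasoning: the <work> token prompts the model to emit
# intermediate working before its final answer.
reasoning_prompt = "Question: What is the derivative of x**3 + 2*x?\n\n<work>"

for prompt in (citation_prompt, reasoning_prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs.input_ids, max_new_tokens=100)
    print(tokenizer.decode(outputs[0]))
```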
Frequently Asked Questions
Q: What makes this model unique?
GALACTICA 120B stands out due to its specialized training on scientific content and its ability to handle multiple scientific modalities. It outperforms general-purpose language models on knowledge-intensive scientific tasks while exhibiting lower toxicity than other large language models.
Q: What are the recommended use cases?
The model is primarily intended for researchers studying language models in scientific domains and for developers building scientific tools. Production use should include appropriate safeguards, as the model can hallucinate plausible-sounding but incorrect scientific content.