GALACTICA 120B
| Property | Value |
|---|---|
| Parameters | 120 Billion |
| License | CC BY-NC 4.0 |
| Release Date | November 2022 |
| Training Data | 106B tokens of scientific text |
| Paper | Research Paper |
What is galactica-120b?
GALACTICA 120B is Meta AI's largest scientific language model, designed specifically for scientific tasks and research. Trained on a large corpus of scientific papers, textbooks, and reference material, it is intended to help store, combine, and reason about scientific knowledge. The model is the largest member of the GALACTICA family, whose variants range from 125M to 120B parameters.
Implementation Details
The model utilizes a transformer-based architecture in a decoder-only setup, implemented using PyTorch. It can be deployed using various precision options including FP16 and INT8 for efficient inference, and supports both CPU and GPU execution through the Hugging Face transformers library.
- Supports multiple scientific tasks including citation prediction and mathematical reasoning
- Implements specialized tokenization for different scientific modalities
- Offers flexible deployment options with various hardware configurations
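The deployment options above can be exercised through the standard transformers API. The following is a minimal loading sketch, assuming the Hugging Face hub ID `facebook/galactica-120b`, the `accelerate` package for weight sharding, and enough GPU memory for the FP16 weights (roughly 240 GB for 120B parameters); INT8 quantization would roughly halve that again.

```python
# Minimal loading sketch (assumes facebook/galactica-120b on the Hugging Face hub
# and a multi-GPU host with sufficient memory).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-120b")

# FP16 halves memory relative to FP32; device_map="auto" shards the weights
# across available GPUs (requires the `accelerate` package).
model = AutoModelForCausalLM.from_pretrained(
    "facebook/galactica-120b",
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("The attention mechanism in Transformers", return_tensors="pt").to(model.device)
outputs = model.generate(inputs.input_ids, max_new_tokens=60)
print(tokenizer.decode(outputs[0]))
```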
Core Capabilities
- Scientific Question Answering
- Mathematical Reasoning
- Document Summarization
- Molecular Property Prediction
- Entity Extraction
- Citation Prediction
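Several of these capabilities are exposed through prompting conventions rather than separate APIs: the paper describes a `[START_REF]` marker for citation prediction and a `<work>` token that asks the model to show step-by-step working. The snippet below is an illustrative sketch of these prompt styles, reusing the hypothetical `model` and `tokenizer` from the loading example above; the prompts themselves are placeholders, not examples from the paper.

```python
# Illustrative prompt-style sketch (assumes `model` and `tokenizer`
# from the loading example above).

# Citation prediction: the model continues after [START_REF] with a
# predicted reference for the preceding statement.
citation_prompt = "The Transformer architecture [START_REF]"

# Mathematical reasoning: the <work> token prompts the model to emit
# intermediate working before its final answer.
reasoning_prompt = "Question: What is the derivative of x**3 + 2*x?\n\n<work>"

for prompt in (citation_prompt, reasoning_prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs.input_ids, max_new_tokens=100)
    print(tokenizer.decode(outputs[0]))
```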
Frequently Asked Questions
Q: What makes this model unique?
GALACTICA 120B stands out due to its specialized training on scientific content and its ability to handle multiple scientific modalities. It outperforms general-purpose language models on knowledge-intensive scientific tasks while exhibiting lower toxicity than other large language models.
Q: What are the recommended use cases?
The model is primarily intended for researchers studying language models in scientific domains and for developers building scientific tools. Production use should include appropriate safeguards, as the model can hallucinate plausible-sounding but incorrect scientific content.