OPT-350M

Developer: Meta AI (Facebook)
Architecture: Decoder-only Transformer
License: Other (Custom)
Paper: Open Pre-trained Transformer Language Models

What is opt-350m?

OPT-350M is part of Meta AI's Open Pre-trained Transformer (OPT) series, designed to democratize access to large language models. With 350M parameters, it is one of the smaller, more accessible members of the family, trained with a causal language modeling objective on roughly 180B tokens of text.
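For readers unfamiliar with the objective, the sketch below illustrates what causal language modeling means in practice, using the public facebook/opt-350m checkpoint and the Hugging Face transformers library: the model predicts each token from the tokens before it, so the input ids can double as the labels. The prompt here is arbitrary.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Causal LM objective: predict token t from tokens < t, so the
# input ids serve as their own labels (shifted internally by the model).
inputs = tokenizer("Large language models are", return_tensors="pt")
loss = model(**inputs, labels=inputs["input_ids"]).loss
print(loss.item())
```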

Implementation Details

The model uses GPT-2's byte-level Byte Pair Encoding (BPE) with a 50,272-token vocabulary and processes sequences of 2,048 consecutive tokens. It was trained on a corpus that includes BookCorpus, CC-Stories, selected components of The Pile, Pushshift.io Reddit data, and CCNewsV2.

  • Pre-training objective: Causal Language Modeling (CLM)
  • Primary language: English (with some multilingual content via CommonCrawl)
  • Training data size: 800GB (180B tokens)
  • Tokenization: GPT-2 byte-level BPE (see the tokenizer sketch below)
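As a quick check of these settings, here is a minimal sketch that loads the tokenizer and encodes a sample string. The exact numbers it reports can vary with the transformers version, and the tokenizer's base vocabulary may be slightly smaller than the padded 50,272-entry embedding matrix.

```python
from transformers import AutoTokenizer

# GPT-2-style byte-level BPE tokenizer shipped with OPT-350M
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

encoded = tokenizer("Open Pre-trained Transformers democratize access to LLMs.")
print(encoded["input_ids"])   # OPT prepends a </s> BOS token to every sequence
print(tokenizer.vocab_size)   # base vocabulary; embeddings are padded to 50,272
```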

Core Capabilities

  • Text generation and completion (see the generation sketch after this list)
  • Zero-shot and few-shot learning
  • Custom prompt-based tasks
  • Research and experimentation
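As an illustration of the generation capabilities, a minimal sketch using the transformers pipeline API; the prompt and sampling parameters are illustrative, not tuned recommendations.

```python
from transformers import pipeline

# Text-generation pipeline backed by the OPT-350M checkpoint
generator = pipeline("text-generation", model="facebook/opt-350m")

# Zero-shot completion: the model simply continues the prompt
outputs = generator("The study of language models shows that",
                    max_new_tokens=30, do_sample=True, top_p=0.9)
print(outputs[0]["generated_text"])
```

Few-shot prompting works the same way: include a handful of worked examples in the prompt string before the final query.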

Frequently Asked Questions

Q: What makes this model unique?

OPT-350M stands out for its open-access nature and research-friendly design, allowing researchers to study large language model behavior while requiring fewer computational resources than larger variants.

Q: What are the recommended use cases?

The model is best suited to text generation tasks, research purposes, and fine-tuning for specific downstream applications. It can be loaded for both inference and fine-tuning with the Hugging Face transformers library, as sketched below.
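A minimal fine-tuning sketch under assumed inputs: my_corpus.txt is a hypothetical plain-text file standing in for your own data, and the hyperparameters are placeholders rather than recommendations.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Hypothetical corpus: any plain-text file (one document per line) works here
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# mlm=False gives the causal LM objective OPT was pre-trained with
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="opt-350m-finetuned",
                           per_device_train_batch_size=4,
                           num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```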
