plamo-2-1b

Maintained by: pfnet

PLaMo 2 1B

Property          Value
Parameter Count   1 billion
Training Tokens   4 trillion
Languages         English, Japanese
License           Apache License 2.0
Developer         Preferred Elements, Inc.

What is plamo-2-1b?

PLaMo 2 1B is a bilingual (English and Japanese) language model with about 1 billion parameters, developed by Preferred Elements, Inc. It uses a hybrid architecture that combines the selective State Space Model (SSM) of Mamba with sliding window attention, where each token attends only to a fixed-size window of recent tokens rather than to the full sequence. The design aims for more efficient language processing than attention over the whole context.
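To make the sliding window idea concrete, here is a minimal sketch of a causal sliding-window attention mask in plain Python. It illustrates the general technique only: the function name and the window size used in the demo are hypothetical, the model's actual window size is not stated here, and the SSM half of the architecture is not covered.

```python
from __future__ import annotations


def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query position i may attend to key j.

    A position sees itself and the (window - 1) positions before it,
    and never sees future positions.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(i - window < j <= i) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Illustrative values only; not the model's real configuration.
for row in sliding_window_causal_mask(seq_len=5, window=3):
    print("".join("x" if allowed else "." for allowed in row))
```

Each row has at most `window` allowed positions, so the attention cost per token stays bounded as the sequence grows.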

Implementation Details

The model was trained in two phases: 3.5T tokens in phase 1 and 0.5T tokens in phase 2, for 4T tokens in total. It includes normalization layers intended to improve training stability and uses the Mamba2 kernel for computational efficiency. The tokenizer is accelerated with numba, a JIT compiler for numerical functions.

  • Hybrid architecture combining Mamba SSM and sliding window attention
  • Training data mix: 45% English, 30% Japanese, 15% Coding, 10% Other content
  • Enhanced normalization layers for stability
  • Optimized tokenizer with numba implementation
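The token figures above can be checked with a little arithmetic. The sketch below sums the two phases and splits the total by the listed percentages. It assumes, purely for illustration, that the same mix applies uniformly across both phases, which this page does not confirm; the helper name is hypothetical.

```python
# Token counts in billions, as listed for the two training phases.
PHASE_TOKENS_BILLIONS = {"phase 1": 3500, "phase 2": 500}

# Data mix in percent, as listed above.
DATA_MIX_PERCENT = {"English": 45, "Japanese": 30, "Coding": 15, "Other": 10}


def tokens_per_category(total_billions: int, mix_percent: dict) -> dict:
    """Split a token budget by percentage (integer arithmetic, in billions)."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("percentages must sum to 100")
    return {name: total_billions * pct // 100 for name, pct in mix_percent.items()}


total_billions = sum(PHASE_TOKENS_BILLIONS.values())

# ASSUMPTION: the mix is applied to the whole budget, not per phase.
breakdown = tokens_per_category(total_billions, DATA_MIX_PERCENT)

print(f"total: {total_billions}B tokens")
for name, billions in breakdown.items():
    print(f"{name}: {billions}B tokens")
```

Under that assumption, the total matches the 4 trillion training tokens listed in the property table.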

Core Capabilities

  • Bilingual processing in English and Japanese
  • Efficient text generation and completion
  • Code processing capabilities
  • Usable through the Hugging Face Transformers library
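As a rough sketch of what using the model through Hugging Face Transformers could look like, the function below loads the tokenizer and model and completes a prompt. Several details are assumptions rather than facts from this page: the Hub id `pfnet/plamo-2-1b` is inferred from the model and maintainer names, `trust_remote_code=True` is assumed to be needed because the architecture is custom, and extra packages (for example PyTorch and Mamba kernels) may be required. Check the official model card before relying on any of it.

```python
MODEL_ID = "pfnet/plamo-2-1b"  # ASSUMPTION: Hub id inferred from the names above


def generate_text(prompt: str, max_new_tokens: int = 32) -> str:
    """Complete `prompt` with the base model and return the decoded text."""
    # Imported lazily so that defining this function does not require
    # transformers to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # ASSUMPTION: custom model code on the Hub requires trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the model weights on first use):
#   print(generate_text("The future of artificial intelligence is"))
```

Because the model is a base model, the prompt should be text to continue rather than an instruction.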

Frequently Asked Questions

Q: What makes this model unique?

PLaMo 2 1B stands out for its hybrid architecture, which combines Mamba SSM with sliding window attention to improve efficiency. It is bilingual in English and Japanese, and its training data also includes code and other content, which broadens the range of text it can handle.

Q: What are the recommended use cases?

The model is primarily designed for text generation tasks in both English and Japanese. It has NOT been instruction-tuned for chat dialog or other downstream tasks, so users should perform safety testing and tuning for their specific applications.
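Since the model is not instruction-tuned, one common way to steer a base model is to phrase the task as text to be continued, for example with a few worked examples. The helper below is a hypothetical sketch of that general prompting pattern; its name, labels, and format are illustrative and are not a format recommended by the developer.

```python
def build_few_shot_prompt(examples, query, input_label="English", output_label="Japanese"):
    """Build a completion-style prompt from (input, output) example pairs.

    The prompt ends with an empty output slot for the model to fill in.
    """
    lines = []
    for source, target in examples:
        lines.append(f"{input_label}: {source}")
        lines.append(f"{output_label}: {target}")
        lines.append("")  # blank line between examples
    lines.append(f"{input_label}: {query}")
    lines.append(f"{output_label}:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    examples=[("Good morning.", "おはようございます。")],
    query="Thank you.",
)
print(prompt)
```

Outputs produced this way still need the safety testing mentioned above before use in an application.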
