gemma2-9b-cpt-sahabatai-v1-base

GoToCompany

A 9B parameter multilingual LLM optimized for Indonesian, Javanese, and Sundanese languages, with strong performance across regional benchmarks.

Property         Value
Parameter Count  10.2B
Languages        English, Indonesian, Javanese, Sundanese
Context Length   8192 tokens
License          Gemma Community License
Training Data    50B tokens

What is gemma2-9b-cpt-sahabatai-v1-base?

Sahabat-AI v1 base is a continued pre-trained language model built on the Gemma2 9B architecture and optimized for Indonesian and the regional languages Javanese and Sundanese. The model was co-initiated by GoTo Group and Indosat Ooredoo Hutchison and aims to advance multilingual AI capabilities for Southeast Asian languages.

Implementation Details

The model was trained with MosaicML Composer on 32 Nvidia H100 80GB GPUs over 7 days. Training used bfloat16 precision, a decoupled AdamW optimizer, and a weight-stable-decay learning rate scheduler, with a learning rate of 1.0e-5 and a global batch size of 256.

  • Achieves a reported state-of-the-art overall score of 64.123% on regional language tasks
  • Trained on a diverse dataset including Dolma Refined Web, Stack V2, and specialized regional language corpora
  • Uses the Gemma-2-9B tokenizer
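
The training figures above can be combined into a few back-of-the-envelope numbers. The following is a minimal sketch, not a description of the actual training run: it assumes that every sequence was packed to the full 8192-token context length and that the 50B training tokens were processed once within the 7 days, neither of which is stated in the source.

```python
# Figures quoted in the text above.
NUM_GPUS = 32
TRAINING_DAYS = 7
GLOBAL_BATCH_SIZE = 256   # sequences per optimizer step
TRAINING_TOKENS = 50e9    # "50B tokens"

# Assumption (not stated in the source): every sequence is packed to the
# full 8192-token context length.
ASSUMED_SEQ_LEN = 8192

# Total compute budget in GPU-hours.
gpu_hours = NUM_GPUS * TRAINING_DAYS * 24

# Share of each global batch handled by one GPU under plain data parallelism.
sequences_per_gpu = GLOBAL_BATCH_SIZE // NUM_GPUS

# Tokens consumed per optimizer step, and the implied number of steps
# (assuming a single pass over the 50B tokens).
tokens_per_step = GLOBAL_BATCH_SIZE * ASSUMED_SEQ_LEN
approx_steps = TRAINING_TOKENS / tokens_per_step

# Implied average throughput per GPU.
tokens_per_gpu_second = TRAINING_TOKENS / (gpu_hours * 3600)

print(f"GPU-hours:            {gpu_hours}")
print(f"Sequences per GPU:    {sequences_per_gpu}")
print(f"Tokens per step:      {tokens_per_step:,}")
print(f"Approx. steps:        {approx_steps:,.0f}")
print(f"Tokens / GPU-second:  {tokens_per_gpu_second:,.0f}")
```

If the real run used shorter sequences or more than one pass over the data, the step count and throughput would change accordingly; only the GPU-hours and the per-GPU share of the batch follow directly from the quoted figures.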

Core Capabilities

  • Strong scores on Indonesian (60.040%), Javanese (69.882%), and Sundanese (62.446%) language tasks
  • Strong multilingual understanding and generation capabilities
  • Maintains competitive performance on English tasks, with an average score of 19.62%
  • Supports context length of up to 8192 tokens
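
The source does not say how the overall regional score of 64.123% is computed. As a quick consistency check, the sketch below shows that it matches the unweighted mean of the three per-language scores listed above; treating it as a simple average is an assumption, not a documented aggregation rule.

```python
# Per-language scores quoted in the list above.
scores = {
    "Indonesian": 60.040,
    "Javanese": 69.882,
    "Sundanese": 62.446,
}

# Overall regional score quoted under Implementation Details.
REPORTED_OVERALL = 64.123

# Assumption: the overall score is the unweighted mean of the three languages.
mean_score = sum(scores.values()) / len(scores)

print(f"Unweighted mean:  {mean_score:.3f}")
print(f"Reported overall: {REPORTED_OVERALL:.3f}")
```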

Frequently Asked Questions

Q: What makes this model unique?

The model's primary strength lies in its exceptional performance across Indonesian and regional languages, significantly outperforming other models in Javanese and Sundanese language tasks while maintaining strong capabilities in Indonesian and English.

Q: What are the recommended use cases?

The model is particularly well-suited for applications requiring deep understanding of Indonesian, Javanese, and Sundanese languages, including text analysis, content generation, and language processing tasks in these languages. However, as a base model, it requires additional safety fine-tuning for production use.
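
Because this is a base model rather than an instruction-tuned one, a common way to use it is plain text completion with a few-shot prompt. The sketch below illustrates that pattern with the Hugging Face `transformers` library. The Hub ID `GoToCompany/gemma2-9b-cpt-sahabatai-v1-base` is an assumption pieced together from the publisher and model name on this page, the helper names `build_few_shot_prompt` and `complete` are hypothetical, and the prompt layout is only one of many reasonable choices. `device_map="auto"` additionally requires the `accelerate` package.

```python
# Assumed Hugging Face Hub ID (publisher + model name from this page).
MODEL_ID = "GoToCompany/gemma2-9b-cpt-sahabatai-v1-base"
CONTEXT_LENGTH = 8192  # context length quoted on this page


def build_few_shot_prompt(examples, query):
    """Build an Indonesian-to-English few-shot completion prompt.

    `examples` is a list of (indonesian, english) pairs. The layout is a
    hypothetical choice; base models do not require a specific template.
    """
    blocks = [f"Indonesian: {src}\nEnglish: {tgt}\n" for src, tgt in examples]
    blocks.append(f"Indonesian: {query}\nEnglish:")
    return "\n".join(blocks)


def complete(prompt, max_new_tokens=32):
    """Generate a continuation for `prompt` (downloads the model weights)."""
    # Imported lazily so the prompt helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[1]
    # Keep prompt plus generated tokens within the context window.
    if prompt_len + max_new_tokens > CONTEXT_LENGTH:
        raise ValueError("prompt plus max_new_tokens exceeds the context length")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Return only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)


prompt = build_few_shot_prompt(
    [("Selamat pagi", "Good morning")],
    "Terima kasih",
)
print(prompt)
# To run generation (requires a GPU with enough memory): complete(prompt)
```

Output from a base model is an unfiltered continuation, so anything built this way still needs the safety fine-tuning or filtering mentioned above before production use.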
