SeaLLM-13B-Chat

Maintained by: SeaLLMs

Property: Value
Base Model: Llama-2
Languages Supported: 10 (Vietnamese, Indonesian, Thai, Chinese, Khmer, Lao, Burmese, Malay, Tagalog, English)
License: SeaLLMs License
Paper: Technical Report

What is SeaLLM-13B-Chat?

SeaLLM-13B-Chat is a large language model built for Southeast Asian languages. Starting from Llama-2, it has undergone continued pre-training and fine-tuning across 10 languages, with particular emphasis on non-Latin-script languages such as Thai, Khmer, Lao, and Burmese. The model is notable for its cultural adaptation and is reported to outperform ChatGPT-3.5 in many Southeast Asian languages.

Implementation Details

The model's main technical change is a vocabulary expansion that substantially lowers the token compression ratio for Southeast Asian text (for example, the ratio for Thai drops from 4.29x to 1.57x), so the same text is encoded in far fewer tokens. Training proceeded in multiple stages: pre-training, supervised fine-tuning, and self-preferencing DPO (Direct Preference Optimization).

  • Expanded vocabulary with ~16K new tokens for SEA languages
  • Multi-stage training process with dynamic data mixture control
  • Culturally-adapted safety measures and local compliance
  • Enhanced tokenization efficiency for non-Latin scripts
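
To make the Thai figures above concrete, here is a minimal arithmetic sketch. It assumes that both ratios (4.29x and 1.57x) are measured against the same fixed baseline, so their quotient gives the relative reduction in token count. The function name and the 4,096-token context size are illustrative assumptions, not values taken from the model's documentation.

```python
def token_reduction(ratio_before: float, ratio_after: float) -> tuple[float, float]:
    """Compare two compression ratios measured against the same baseline.

    Returns (factor, saved), where `factor` is how many times fewer tokens
    the new tokenizer needs and `saved` is the fraction of tokens removed.
    """
    if ratio_before <= 0 or ratio_after <= 0:
        raise ValueError("compression ratios must be positive")
    factor = ratio_before / ratio_after
    saved = 1.0 - ratio_after / ratio_before
    return factor, saved


# Thai figures quoted in the text: 4.29x before expansion, 1.57x after.
factor, saved = token_reduction(4.29, 1.57)
print(f"{factor:.2f}x fewer tokens ({saved:.0%} saved)")

# Hypothetical context size, used only to illustrate the effect:
# a window that fits N tokens now holds roughly `factor` times more Thai text.
context_tokens = 4096
equivalent_old_tokens = round(context_tokens * factor)
print(f"{context_tokens} new tokens cover about {equivalent_old_tokens} old tokens of Thai text")
```

Under that assumption, Thai text needs roughly 2.7 times fewer tokens after the vocabulary expansion, which translates directly into longer effective context and lower per-character generation cost.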

Core Capabilities

  • Reported to outperform ChatGPT-3.5 in non-Latin-script Southeast Asian languages
  • Strong results on the M3Exam benchmark across multiple languages
  • Enhanced cultural understanding and local context awareness
  • Improved machine translation capabilities for low-resource languages
  • Strong safety measures aligned with local cultural norms and regulations

Frequently Asked Questions

Q: What makes this model unique?

SeaLLM-13B-Chat's distinguishing strength is its optimization for Southeast Asian languages, particularly those written in non-Latin scripts, while it maintains strong performance in English. It is also better adapted to local culture and compliance requirements than Western-built LLMs.

Q: What are the recommended use cases?

The model suits applications that require a deep understanding of Southeast Asian languages and cultures, including translation, content generation, and educational assistance. It is particularly effective for tasks in Thai, Khmer, Lao, and Burmese, where it shows significant advantages over existing models.
