Llama-3.1-Swallow-8B-Instruct-v0.1

Llama-3.1-Swallow-8B-Instruct-v0.1

tokyotech-llm

An 8B parameter Japanese-enhanced LLaMA 3.1 model, fine-tuned for instruction following with improved bilingual capabilities and strong performance on Japanese NLP tasks

PropertyValue
Parameter Count8.03B
Model TypeLLaMA Architecture
LicenseMETA LLAMA 3.1 COMMUNITY LICENSE & Gemma Terms of Use
LanguagesJapanese, English
PaperLLaMA 3 Paper

What is Llama-3.1-Swallow-8B-Instruct-v0.1?

Llama-3.1-Swallow-8B-Instruct is an advanced language model that enhances the Japanese language capabilities of Meta's LLaMA 3.1 while maintaining strong English performance. It was developed through continual pre-training using approximately 200 billion tokens from Japanese web corpus, Wikipedia articles, and specialized content.

Implementation Details

The model underwent extensive training using the Megatron-LM framework and was fine-tuned on carefully curated instruction datasets. It leverages both synthetic and human-curated data to ensure high-quality responses in both Japanese and English contexts.

  • Built on LLaMA 3.1 architecture with 8B parameters
  • Trained on Swallow Corpus Version 2 and multilingual content
  • Supports both Japanese and English instruction following
  • Implements advanced tokenization for efficient processing

Core Capabilities

  • Strong performance in Japanese NLP tasks (achieving top scores in multiple benchmarks)
  • Maintains competitive English language capabilities
  • Excels in tasks like translation, summarization, and question-answering
  • Specialized instruction-following abilities in both languages

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its enhanced Japanese language capabilities while maintaining strong English performance, achieving state-of-the-art results in various Japanese NLP benchmarks while preserving LLaMA 3.1's English capabilities.

Q: What are the recommended use cases?

The model is well-suited for bilingual applications including translation, summarization, question-answering, and general instruction following in both Japanese and English contexts. It's particularly effective for tasks requiring deep understanding of Japanese language and culture.

Socials
PromptLayer
Company
All services online
Location IconPromptLayer is located in the heart of New York City
PromptLayer © 2026