Skywork-13B-base

Maintained by: Skywork


  • Model Size: 13B parameters
  • Training Data: 3.2T tokens
  • Language Distribution: 52.2% English, 39.6% Chinese, 8% Code
  • License: Skywork Community License
  • Paper: Technical Report

What is Skywork-13B-base?

Skywork-13B-base is a bilingual (Chinese-English) foundation model built on a "thin and deep" architecture: 52 transformer layers rather than the 40 typical of comparable 13B models. The model performs strongly across multiple benchmarks, including C-Eval (60.6%), CMMLU (61.8%), and MMLU (62.1%).

Implementation Details

The model employs a modified architecture with 4,608 hidden dimensions and 12,288 FFN dimensions, utilizing a custom vocabulary of 65,536 tokens optimized for bilingual processing. The tokenizer implements Byte-Pair Encoding with specific allocations for Latin characters, Chinese characters, and specialized tokens.
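Byte-Pair Encoding, as mentioned above, builds its vocabulary by repeatedly merging the most frequent adjacent pair of symbols into a single new symbol. The toy sketch below is purely illustrative (it is not Skywork's actual merge table or training corpus) and shows two merge iterations turning characters into subword tokens:

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent symbol pair in the sequence."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Start from individual characters (byte-level BPE starts from raw bytes).
tokens = list("low lower lowest")
for _ in range(2):  # two merge iterations: 'l'+'o' -> 'lo', 'lo'+'w' -> 'low'
    tokens = merge_pair(tokens, most_frequent_pair(tokens))
print(tokens)  # 'low' is now a single subword token
```

A production tokenizer learns tens of thousands of such merges from the training corpus; the resulting 65,536-entry vocabulary is what lets the model represent Latin text, Chinese characters, and code compactly.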

  • 52 transformer layers (vs. 40 in typical 13B models)
  • 4,608 hidden dimension size
  • 36 attention heads
  • Custom 65K vocabulary
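As a sanity check, the dimensions listed above are consistent with the advertised 13B parameter count. The estimate below assumes a LLaMA-style block layout (four d×d attention projections, a SwiGLU FFN with three projections, untied input/output embeddings) — an assumption for illustration, not a detail confirmed by this card:

```python
# Rough parameter estimate from the architecture bullets above.
# Assumed layout (not confirmed by the card): LLaMA-style blocks with
# 4*d^2 attention projections, a SwiGLU FFN (3 * d * d_ffn), and
# untied input embedding + LM head.
layers, d, d_ffn, vocab = 52, 4608, 12288, 65536

attention = 4 * d * d        # Q, K, V, O projections
ffn = 3 * d * d_ffn          # gate, up, down projections (SwiGLU)
per_layer = attention + ffn

embeddings = 2 * vocab * d   # input embedding + output head
total = layers * per_layer + embeddings
print(f"~{total / 1e9:.2f}B parameters")
```

Under these assumptions the total lands just under 14B, in line with the "13B" class label.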

Core Capabilities

  • Superior bilingual understanding and generation
  • Strong performance in technical and academic content
  • Efficient code processing capabilities
  • State-of-the-art performance in multiple benchmarks

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive "thin and deep" architecture with 52 layers and optimized dimensions, combined with its carefully curated multilingual training data, sets it apart from other models in its class.

Q: What are the recommended use cases?

The model excels in bilingual applications, academic content processing, technical documentation, and code-related tasks. It's particularly well-suited for applications requiring strong Chinese-English capabilities.
