# MOSS-moon-003-sft-plugin
| Property | Value |
|---|---|
| Model Size | 16B parameters |
| License | AGPL-3.0 |
| Base Paper | CodeGen Paper |
| Languages | English, Chinese |
## What is moss-moon-003-sft-plugin?
MOSS-moon-003-sft-plugin is a conversational AI model developed by Fudan University that combines large language model capabilities with plugin integration. Built on a 16B-parameter architecture, it is designed to handle both Chinese and English interactions while invoking external tools through plugins.
## Implementation Details
The model is initialized from the CodeGen architecture and further pre-trained on 100B Chinese tokens and 20B English tokens; counting CodeGen's original training data, it has seen approximately 700B tokens in total. On top of this base, it is supervised fine-tuned on roughly 1.1M multi-turn conversations plus about 300K plugin-augmented interactions.
- Supports model quantization (INT4/INT8) for efficient deployment (see the loading sketch below)
- Requires roughly 12GB–81GB of GPU memory depending on precision
- Implements a plugin architecture for external tool integration
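As a minimal sketch of deployment, the quantized variant can be loaded through Hugging Face Transformers. The checkpoint name below follows the fnlp/MOSS repo's naming convention for the INT4 variant, and the `<|Human|>`/`<eoh>` dialogue tags follow its documented prompt format; verify both against the Hub before relying on them.

```python
# Minimal loading sketch using Hugging Face Transformers.
# Checkpoint name assumed from the fnlp/MOSS repo convention (INT4 variant,
# which fits in roughly 12GB of VRAM); verify on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "fnlp/moss-moon-003-sft-plugin-int4"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True
).half().cuda()
model.eval()

# MOSS uses a tagged dialogue format; <eoh> closes the human turn.
query = "<|Human|>: Hi, what can you do?<eoh>\n<|MOSS|>:"
inputs = tokenizer(query, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.8
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```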
## Core Capabilities
- Multi-tool integration (search engine, calculator, text-to-image, equation solver); see the round-trip sketch after this list
- Bilingual conversation handling (Chinese and English)
- Code generation and understanding
- Mathematical problem solving
- Safe and controlled responses with built-in harmlessness features
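Tool use is driven by the model itself: it plans in an inner-thoughts turn, emits a command, and the calling application executes it and feeds the result back. The sketch below (continuing from the loading example above) assumes the tag format documented in the fnlp/MOSS repo (`<|Inner Thoughts|>`, `<|Commands|>`, `<|Results|>`) and omits the meta instruction that enables specific plugins; the `run_search()` helper is hypothetical and stands in for a real search backend.

```python
# Sketch of one plugin round-trip, assuming the MOSS tag format.
def run_search(query: str) -> str:
    """Hypothetical tool backend; replace with a real search API call."""
    return "Example result for: " + query

# Stage 1: the model plans and emits a tool command ending in <eoc>,
# e.g.  <|Commands|>: Search("The Glory cast")<eoc>
prompt = '<|Human|>: Who starred in The Glory?<eoh>\n<|Inner Thoughts|>:'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
stage1 = tokenizer.decode(
    model.generate(**inputs, max_new_tokens=128)[0], skip_special_tokens=False
)

# Stage 2: the caller executes the command and feeds the results back,
# then the model produces its final <|MOSS|> answer.
results = 'Search("The Glory cast") => ' + run_search("The Glory cast")
prompt2 = stage1 + "\n<|Results|>: " + results + "<eor>\n<|MOSS|>:"
inputs2 = tokenizer(prompt2, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs2, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```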
## Frequently Asked Questions
**Q: What makes this model unique?**
MOSS distinguishes itself through its plugin architecture and bilingual capabilities, allowing seamless integration of external tools while maintaining high-quality conversations in both English and Chinese. The model's ability to run on consumer hardware through quantization is also notable.
**Q: What are the recommended use cases?**
The model excels in interactive scenarios requiring tool use, such as web search-augmented conversations, mathematical problem solving, and creative tasks involving text-to-image generation. It's particularly suited for applications requiring bilingual capabilities and safe, controlled responses.