stories15M_MOE

Maintained by: ggml-org

Property           Value
Model Type         Mixture of Experts (MoE)
Base Model         TinyLlama-15M-stories
Number of Experts  4
Source             HuggingFace

What is stories15M_MOE?

stories15M_MOE is an experimental Mixture of Experts (MoE) model created by replicating the TinyLlama-15M-stories model four times, so that each of the four expert networks starts as an identical copy of the base model. It is intended primarily for testing MoE code paths and for story generation; because the dense base model has no router to inherit, the router weights are randomly initialized and used to direct input to the experts.
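The construction described above can be sketched in a few lines of numpy. This is an illustrative outline, not the actual conversion script; the tensor names and the hidden/feed-forward sizes are assumptions, and random stand-ins take the place of the real TinyLlama-15M-stories weights.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts = 4
d_model = 288   # hidden size, illustrative stand-in for TinyLlama-15M
d_ff = 768      # feed-forward size, illustrative

# Dense FFN weights from the base checkpoint (random stand-ins here).
w_up = rng.standard_normal((d_ff, d_model)).astype(np.float32)
w_down = rng.standard_normal((d_model, d_ff)).astype(np.float32)

# Replicate the dense FFN four times: each expert is an identical copy.
experts_up = np.stack([w_up.copy() for _ in range(n_experts)])
experts_down = np.stack([w_down.copy() for _ in range(n_experts)])

# The router (gate) does not exist in the dense model, so it is new
# and initialized with small random weights.
w_gate = (0.02 * rng.standard_normal((n_experts, d_model))).astype(np.float32)
```

The only genuinely new parameters are the router's, which is why the model card highlights their random initialization.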

Implementation Details

The model is built on the TinyLlama architecture and implements its MoE layer by using four identical copies of the base model's weights as expert networks. A notable addition is a Shakespeare LoRA adapter, trained on the first 100 paragraphs of Shakespeare's works, which lets the model generate text in both modern and Shakespearean styles.

  • Four expert networks derived from TinyLlama-15M-stories
  • Random router weight initialization
  • Includes specialized Shakespeare LoRA adapter
  • Optimized for story generation tasks
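The router-based expert selection listed above can be sketched as standard top-k MoE routing. This is a simplified sketch, not the actual llama.cpp compute graph (which uses SiLU-gated FFNs and per-layer tensors); the function name and the ReLU placeholder are illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(x, w_gate, experts_up, experts_down, top_k=2):
    """Route one token vector x through the top-k experts (sketch)."""
    logits = w_gate @ x                       # one score per expert
    probs = softmax(logits)
    top = np.argsort(probs)[-top_k:]          # indices of top-k experts
    weights = probs[top] / probs[top].sum()   # renormalize over the top-k
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        h = np.maximum(experts_up[i] @ x, 0.0)   # placeholder ReLU FFN
        out += w * (experts_down[i] @ h)
    return out
```

Because all four experts are identical copies, the weighted combination equals the dense model's FFN output no matter which experts the random router picks, which is exactly what makes this model useful for testing MoE plumbing against a known-good dense baseline.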

Core Capabilities

  • Story generation and narrative creation
  • Dual-style text generation (modern and Shakespearean)
  • Experimental text generation with router-based expert selection
  • Lightweight implementation suitable for testing environments

Frequently Asked Questions

Q: What makes this model unique?

The model's unique feature is its experimental MoE architecture that uses four identical experts from TinyLlama, combined with a specialized Shakespeare LoRA adapter, enabling diverse text generation capabilities despite its small size.

Q: What are the recommended use cases?

The model is primarily intended for testing and experimental purposes, particularly suitable for bedtime story generation or creative writing applications. It's not recommended for production use except in specific story-telling applications.
