# TeapotLLM
| Property | Value |
|---|---|
| Parameter Count | ~800 million |
| Base Model | Flan-T5-Large |
| Training Time | ~10 hours on an A100 |
| License | MIT |
| Model URL | huggingface.co/teapotai/teapotllm |
## What is TeapotLLM?
TeapotLLM is a language model designed for resource-constrained environments, offering efficient inference on CPUs and mobile devices. Fine-tuned from Flan-T5-Large on synthetic data, it specializes in hallucination-resistant question answering and information extraction. Because it builds on a compact encoder-decoder base, it can process queries with high accuracy while keeping memory and compute requirements low.
## Implementation Details
The model was trained on a ~10MB synthetic dataset generated with DeepSeek-V3, consisting of QA pairs in a range of task-specific formats. Training, conducted on a Google Colab A100 GPU, was tuned to balance task-specific performance against catastrophic forgetting. TeapotLLM integrates with both the custom TeapotAI library and Hugging Face's Transformers library. Key features include:
- Optimized for CPU inference with low latency
- Native support for JSON extraction via Pydantic models
- Built-in RAG capabilities with document embedding support
- Picklable model architecture for easy deployment
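The JSON-extraction and picklable-deployment features can be sketched as follows. The `extract_contact` stub below stands in for the actual model call, and the schema and field names are hypothetical; the real TeapotAI library validates model output against Pydantic models, which are not shown here.

```python
import json
import pickle
from dataclasses import dataclass

@dataclass
class Contact:
    """Hypothetical schema for structured extraction."""
    name: str
    age: int

def extract_contact(text: str) -> Contact:
    # Stub standing in for a TeapotLLM call that returns JSON matching
    # the requested schema (the real library prompts the model and
    # validates its output against a Pydantic model).
    raw = '{"name": "Ada Lovelace", "age": 36}'
    fields = json.loads(raw)
    return Contact(name=str(fields["name"]), age=int(fields["age"]))

contact = extract_contact("Ada Lovelace, 36, mathematician.")
print(contact.name)  # structured field access instead of raw text parsing

# Because the objects are picklable, they can be serialized for deployment.
blob = pickle.dumps(contact)
assert pickle.loads(blob) == contact
```

Validating the model's raw JSON into a typed object is what makes the extraction reliable enough for production pipelines.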
## Core Capabilities
- Hallucination-resistant Question Answering
- Retrieval-Augmented Generation (RAG)
- Structured Information Extraction
- Conversational Response Generation
- Document-based Context Processing
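The RAG capability pairs document retrieval with generation. A minimal sketch of the retrieval half, using bag-of-words cosine similarity in place of the learned document embeddings the library actually uses (all function names here are illustrative, not the TeapotAI API):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real library uses learned
    # dense embeddings for its documents.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Standard cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k;
    # in a RAG pipeline the retrieved text is then passed to the
    # model as grounding context for generation.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "The Eiffel Tower is located in Paris.",
    "TeapotLLM is fine-tuned from Flan-T5-Large.",
]
print(retrieve("Where is the Eiffel Tower?", docs))
```

Grounding generation in the retrieved passages is what lets the model answer from documents rather than from parametric memory alone.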
## Frequently Asked Questions
**Q: What makes this model unique?**
TeapotLLM stands out for its ability to run efficiently on resource-constrained devices while maintaining high accuracy in QA tasks. Its hallucination resistance and native JSON extraction capabilities make it particularly valuable for production environments requiring reliable information extraction.
**Q: What are the recommended use cases?**
The model excels at question answering, document processing, information extraction, and RAG implementations. It is particularly suited to scenarios requiring factual accuracy and structured data extraction from text. It is not recommended for code generation, creative writing, or safety-critical decision-making.