AstroSage-8B

Maintained By
AstroMLab

AstroSage-8B

PropertyValue
Parameter Count8 billion
Base ModelMeta-Llama-3.1-8B
PaperarXiv:2411.09012
LicenseLlama 3.1 Community License
Training Data3.3B tokens (Pre-training), 2.0B tokens (Fine-tuning)

What is AstroSage-8B?

AstroSage-8B is a specialized language model designed specifically for astronomy, astrophysics, and cosmology research. Built on the Llama 3.1 architecture, it achieves remarkable performance that rivals GPT-4o while being significantly more cost-effective. The model has been trained on an extensive collection of astronomical literature, including 250,000 arXiv preprints and various astronomical resources.

Implementation Details

The model employs a sophisticated training approach combining Continued Pre-training (CPT) and Supervised Fine-tuning (SFT). It utilizes a novel model merging technique, combining 75% specialized training with 25% Meta-Instruct capabilities.

  • Architecture built on Meta-Llama-3.1-8B framework
  • Trained on ORNL OLCF Frontier infrastructure
  • Supports BF16 tensor operations
  • Implements advanced text generation capabilities

Core Capabilities

  • Achieves 80.9% accuracy on domain-specific tasks
  • Outperforms other 8B parameter models
  • Specialized in astronomical research assistance
  • Excellent at literature review and summarization
  • Supports educational applications in astronomy

Frequently Asked Questions

Q: What makes this model unique?

AstroSage-8B stands out for its specialized focus on astronomy and astrophysics, achieving performance comparable to GPT-4o while being 1000x more cost-effective. Its training on comprehensive astronomical literature makes it particularly effective for domain-specific tasks.

Q: What are the recommended use cases?

The model excels in curiosity-driven question answering, astronomical research assistance, educational support, literature review, and scientific concept explanation. However, it should not be used as the sole source for critical research decisions, and outputs should be verified against primary sources.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.