MatterGen
| Property | Value |
|---|---|
| Parameter Count | 46.8M |
| Model Type | Diffusion Model |
| License | MIT |
| Paper | Nature Publication |
| Architecture | Based on GemNet |
What is MatterGen?
MatterGen is a generative AI model developed by the Microsoft Research AI for Science team for inorganic materials discovery. It uses diffusion modeling to jointly predict three components of a crystal structure: atomic fractional coordinates, elemental composition, and unit cell lattice vectors. The model supports both unconditional generation and property-targeted material design.
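To make the three jointly predicted quantities concrete, here is a minimal sketch of how a crystal can be held in memory. The `Crystal` class and its field names are hypothetical and are not part of MatterGen's code; the two-atom cubic cell is illustrative input only.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class Crystal:
    """Hypothetical container for the three quantities named above."""
    frac_coords: List[Vector]   # one [a, b, c] triple per atom, each in [0, 1)
    atomic_numbers: List[int]   # elemental composition, one entry per atom
    lattice: List[Vector]       # 3x3, rows are lattice vectors in angstroms

    def cartesian_coords(self) -> List[Vector]:
        # A Cartesian position is the fractional-coordinate-weighted
        # sum of the three lattice vectors.
        return [
            [sum(f[k] * self.lattice[k][j] for k in range(3)) for j in range(3)]
            for f in self.frac_coords
        ]


# Illustrative two-atom cubic cell with a 4.0 angstrom edge.
cell = Crystal(
    frac_coords=[[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]],
    atomic_numbers=[55, 17],
    lattice=[[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]],
)
cart = cell.cartesian_coords()
print(cart[1])  # body-center atom in Cartesian coordinates
```

Fractional coordinates are defined relative to the lattice, so the same atomic arrangement can be rescaled by changing only the lattice vectors.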
Implementation Details
The model is built on a GemNet architecture and trained on the Materials Project (MP) and Alexandria datasets, restricted to structures with up to 20 atoms and an energy above hull below 0.1 eV/atom. Training used 8 NVIDIA A100 GPUs, with each epoch processing approximately 600K samples in 6 minutes.
- 46.8M trainable parameters
- Uses float32 precision
- Batch size of 512
- Adaptive learning rate, decreasing from 1e-4 to 1e-6
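The selection criteria and training figures above can be turned into a short sketch. The record layout and the `keep_for_training` helper are hypothetical, and the arithmetic assumes that the batch size of 512 is the global batch size, which the source does not state.

```python
import math

MAX_ATOMS = 20            # from the text: structures with up to 20 atoms
MAX_E_ABOVE_HULL = 0.1    # eV/atom, from the text


def keep_for_training(num_atoms: int, e_above_hull: float) -> bool:
    """Apply the two dataset filters described above (hypothetical helper)."""
    return num_atoms <= MAX_ATOMS and e_above_hull < MAX_E_ABOVE_HULL


# Made-up candidate records, used only to exercise the filter.
candidates = [
    {"id": "a", "num_atoms": 8, "e_above_hull": 0.00},
    {"id": "b", "num_atoms": 24, "e_above_hull": 0.02},   # too many atoms
    {"id": "c", "num_atoms": 12, "e_above_hull": 0.25},   # too far above hull
    {"id": "d", "num_atoms": 20, "e_above_hull": 0.09},
]
kept = [
    c["id"] for c in candidates
    if keep_for_training(c["num_atoms"], c["e_above_hull"])
]

# Back-of-the-envelope figures from the stated training setup.
samples_per_epoch = 600_000
batch_size = 512          # assumed to be the global batch size
epoch_seconds = 6 * 60

steps_per_epoch = math.ceil(samples_per_epoch / batch_size)
samples_per_second = samples_per_epoch / epoch_seconds
param_megabytes = 46.8e6 * 4 / 1e6   # float32 uses 4 bytes per parameter

print(kept)
print(steps_per_epoch, round(samples_per_second), round(param_megabytes, 1))
```

Under these assumptions an epoch is roughly 1,172 optimizer steps at about 1,667 samples per second across the 8 GPUs, and the float32 weights alone occupy about 187 MB.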
Core Capabilities
- Generates novel inorganic material candidates without property conditions
- Supports fine-tuning on user-provided property-labeled materials
- Achieves a 38.57% rate of stable, unique, and novel (S.U.N.) generated structures
- Can target specific properties like bulk modulus and magnetic density
- Processes structures with up to 20 atoms in the unit cell
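A S.U.N. rate like the one quoted above is the fraction of generated structures that are stable, unique, and novel at the same time. The sketch below shows only that final bookkeeping step. The `Evaluated` record and `sun_rate` function are hypothetical, and how each flag is determined (for example by relaxation and structure matching) is outside its scope.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Evaluated:
    """Hypothetical per-structure evaluation result."""
    stable: bool   # passes the chosen stability criterion
    unique: bool   # not a duplicate of another generated structure
    novel: bool    # not already present in the reference dataset


def sun_rate(results: List[Evaluated]) -> float:
    """Fraction of structures that satisfy all three criteria at once."""
    if not results:
        return 0.0
    hits = sum(1 for r in results if r.stable and r.unique and r.novel)
    return hits / len(results)


# Made-up evaluation flags for four generated structures.
batch = [
    Evaluated(stable=True, unique=True, novel=True),
    Evaluated(stable=True, unique=True, novel=False),
    Evaluated(stable=False, unique=True, novel=True),
    Evaluated(stable=True, unique=False, novel=True),
]
rate = sun_rate(batch)
print(f"S.U.N. rate: {rate:.2%}")
```

Because the three conditions are combined with a logical AND, the S.U.N. rate can never exceed the lowest of the individual stable, unique, or novel rates.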
Frequently Asked Questions
Q: What makes this model unique?
MatterGen stands out for its ability to generate stable inorganic materials while targeting specific properties. It can generate materials with extreme property values, such as a bulk modulus of 400 GPa, even though only two such structures exist in the reference dataset.
Q: What are the recommended use cases?
The model is suited to materials science research, particularly the discovery of new inorganic materials with specific properties. It is recommended for generating structures with up to 20 atoms that contain no noble gases or radioactive elements. For property-guided generation, fine-tuning works best when thousands of property-labeled structures are available.
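The recommended limits above can be checked before a candidate composition is used. This is a minimal sketch: the `check_structure` helper is hypothetical, and the `ASSUMED_RADIOACTIVE` set is an illustrative, incomplete list chosen for this example, since the source does not say which elements count as radioactive.

```python
from typing import List

MAX_ATOMS = 20
NOBLE_GASES = {"He", "Ne", "Ar", "Kr", "Xe", "Rn", "Og"}
# Assumption: a partial list of elements without stable isotopes.
ASSUMED_RADIOACTIVE = {
    "Tc", "Pm", "Po", "At", "Fr", "Ra", "Ac", "Th", "Pa", "U", "Np", "Pu",
}


def check_structure(symbols: List[str]) -> List[str]:
    """Return the reasons a structure falls outside the recommended scope."""
    problems = []
    if len(symbols) > MAX_ATOMS:
        problems.append(
            f"{len(symbols)} atoms exceeds the {MAX_ATOMS}-atom limit"
        )
    noble = sorted(set(symbols) & NOBLE_GASES)
    if noble:
        problems.append("contains noble gases: " + ", ".join(noble))
    radioactive = sorted(set(symbols) & ASSUMED_RADIOACTIVE)
    if radioactive:
        problems.append(
            "contains assumed-radioactive elements: " + ", ".join(radioactive)
        )
    return problems


ok = check_structure(["Na", "Cl"])
bad = check_structure(["Xe"] + ["F"] * 4)
print(ok)
print(bad)
```

An empty list means the structure is within the stated scope. Returning reasons rather than a single boolean makes it easier to see why a candidate was rejected.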