Poster
in
Affinity Workshop: New In ML

Using Deep Learning and Large Language Models to Generate 3D Crystal Structures: An Algorithm Review


Abstract:

The discovery of crystalline materials has long been hindered by the inefficiencies of traditional trial-and-error methods and by computationally intensive ab initio calculations, which struggle to navigate the vast configurational space of stable structures. This review examines how deep learning (DL) and large language models (LLMs) are transforming 3D crystal structure generation by learning physical constraints and chemical rules from data. Diffusion models such as DiffCSP enable up to 50× faster generation of stable structures while reproducing experimental space group distributions, and Crystal Diffusion Variational Autoencoders (CDVAEs) reduce energy differences from the ground state by 68.1 meV/atom. LLMs like CrystaLLM leverage autoregressive modeling of Crystallographic Information Files (CIFs) to generate plausible inorganic structures unseen during training, while MatterGPT uses string-based SLICES representations for property-targeted inverse design. Key advancements include symmetry-aware graph networks (e.g., ComFormer) that preserve crystallographic invariances, multimodal frameworks like CrysMMNet that fuse structural and textual information, and conditional diffusion models (e.g., Con-CDVAE) that tailor properties such as bandgap and formation energy. Applications span inorganic semiconductors and metal-organic frameworks (MOFs), with GHP-MOFassemble generating MOFs whose CO₂ capacities exceed those of 96.9% of structures in conventional datasets. Performance benchmarks such as CSPBench highlight DL's strength in structural validity (91.93% force validity) and diversity (Fréchet Distance 0.152), though DFT validation remains critical. As these methods mature, they bridge computation and experiment, promising accelerated discovery of functional materials for energy and electronics.
