Poster
G-Sim: Generative Simulations with Large Language Models and Gradient-Free Calibration
Samuel Holt · Max Ruiz Luyten · Antonin Berthon · Mihaela van der Schaar
East Exhibition Hall A-B #E-1400
Constructing robust simulators is essential for asking "what if?" questions and guiding policy in critical domains like healthcare and logistics. However, existing methods often struggle, either failing to generalize beyond historical data or, when using Large Language Models (LLMs), suffering from inaccuracies and poor empirical alignment. We introduce G-Sim, a hybrid framework that automates simulator construction by combining LLM-driven structural design with rigorous empirical calibration. G-Sim employs an LLM in an iterative loop to propose and refine a simulator's core components and causal relationships, guided by domain knowledge. This structure is then grounded in reality by estimating its parameters using flexible calibration techniques. Specifically, G-Sim can leverage methods that are both likelihood-free and gradient-free with respect to the simulator, such as gradient-free optimization for direct parameter estimation or simulation-based inference for obtaining a posterior distribution over parameters. This allows it to handle non-differentiable and stochastic simulators. By integrating domain priors with empirical evidence, G-Sim produces reliable, causally informed simulators, mitigating data inefficiency and enabling robust system-level interventions for complex decision-making.
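To make the calibration step concrete, here is a minimal sketch of gradient-free parameter estimation for a stochastic, non-differentiable simulator. Everything in it is an illustrative assumption rather than the authors' implementation: the toy ward simulator, the function names (`simulate_ward`, `summary`, `loss`, `calibrate`), the summary statistics, and the choice of a simple random local search as the optimizer.

```python
import random

def simulate_ward(arrival_prob, discharge_prob, days=100, capacity=30, seed=0):
    """Hypothetical toy simulator: daily bed occupancy of a single ward.

    Discrete sampling and the capacity cap make it stochastic and
    non-differentiable with respect to its parameters.
    """
    rng = random.Random(seed)
    occupied = 0
    trajectory = []
    for _ in range(days):
        # Each occupied bed is freed independently with probability discharge_prob.
        occupied -= sum(rng.random() < discharge_prob for _ in range(occupied))
        # Ten arrival attempts per day; admissions are capped by capacity.
        arrivals = sum(rng.random() < arrival_prob for _ in range(10))
        occupied = min(capacity, occupied + arrivals)
        trajectory.append(occupied)
    return trajectory

def summary(trajectory):
    """Summary statistics used for matching: mean and standard deviation."""
    mean = sum(trajectory) / len(trajectory)
    var = sum((x - mean) ** 2 for x in trajectory) / len(trajectory)
    return mean, var ** 0.5

def loss(params, observed_stats, seeds=(1, 2, 3)):
    """Average squared distance between simulated and observed statistics.

    Reusing the same seeds for every candidate (common random numbers)
    keeps the comparison between candidates less noisy.
    """
    total = 0.0
    for s in seeds:
        mean, std = summary(simulate_ward(params[0], params[1], seed=s))
        total += (mean - observed_stats[0]) ** 2 + (std - observed_stats[1]) ** 2
    return total / len(seeds)

def calibrate(observed_stats, init=(0.5, 0.5), iters=200, step=0.1, seed=42):
    """Random local search: perturb the best point, keep improvements only."""
    rng = random.Random(seed)
    best, best_loss = init, loss(init, observed_stats)
    for _ in range(iters):
        candidate = tuple(
            min(0.99, max(0.01, p + rng.gauss(0.0, step))) for p in best
        )
        candidate_loss = loss(candidate, observed_stats)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss

# "Observed" data generated from assumed ground-truth parameters (0.6, 0.2).
observed = summary(simulate_ward(0.6, 0.2, seed=123))
initial_loss = loss((0.5, 0.5), observed)
best_params, best_loss = calibrate(observed)
print("calibrated parameters:", best_params, "loss:", best_loss)
```

The optimizer only ever calls the simulator and compares outputs, so it needs neither a likelihood nor gradients. With only two summary statistics the parameters may be weakly identified; a simulation-based inference method would instead return a posterior over plausible parameter values.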
Making smart decisions for complex systems, like managing hospital capacity or a company's supply chain, often requires asking "what if...?" questions. We rely on computer simulations to explore different scenarios, but building accurate ones is a major challenge. Current methods often fail in one of two ways: some are stuck in the past, unable to predict new situations they haven't seen in the data, while others that use creative AI like Large Language Models (LLMs) can be unreliable and invent details that don't match reality. Our work, G-Sim, introduces a hybrid approach that combines the strengths of both. First, we use an LLM's vast knowledge to sketch a basic blueprint of the system, outlining its main parts and how they connect. Then, in a crucial second step, we use flexible algorithms to automatically tune this blueprint, adjusting all its specific numbers until the simulation's behavior accurately matches real-world data. If the simulation is still flawed, G-Sim identifies the problem and asks the LLM to propose a better blueprint, repeating the process until the simulation matches the data. This method results in trustworthy digital replicas of complex systems. It empowers decision-makers, from hospital administrators to city planners, to safely and reliably test the consequences of their choices before implementing them in the real world, leading to better, safer, and more informed strategies.
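The propose, tune, and revise loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not G-Sim itself: `propose_structure` is a hypothetical stand-in for the LLM call that simply walks through a fixed list of candidate blueprints, the candidate models (`linear`, `saturating`) are invented for the example, and a plain grid search plays the role of the gradient-free tuning step.

```python
def linear(x, k):
    """Candidate blueprint 1 (hypothetical): output grows in proportion to input."""
    return k * x

def saturating(x, k):
    """Candidate blueprint 2 (hypothetical): output levels off as input grows."""
    return k * x / (1.0 + x)

CANDIDATES = [("linear", linear), ("saturating", saturating)]

def propose_structure(round_index, feedback):
    """Stand-in for the LLM: a real system would condition on the feedback text.

    Here it just returns the next candidate from a fixed list.
    """
    return CANDIDATES[round_index % len(CANDIDATES)]

def calibrate(model, data):
    """Gradient-free tuning: grid search over k in 0.01, 0.02, ..., 10.00."""
    def mse(k):
        return sum((model(x, k) - y) ** 2 for x, y in data) / len(data)
    grid = [i / 100 for i in range(1, 1001)]
    best_k = min(grid, key=mse)
    return best_k, mse(best_k)

def build_simulator(data, tolerance=1e-3, max_rounds=5):
    """Propose a blueprint, tune it, and ask for a revision if the fit is poor."""
    feedback = None
    for round_index in range(max_rounds):
        name, model = propose_structure(round_index, feedback)
        k, error = calibrate(model, data)
        if error <= tolerance:
            break  # the tuned blueprint reproduces the data well enough
        # Diagnostic passed back to the (stubbed) proposer for the next round.
        feedback = f"{name} left error {error:.4f} after calibration"
    return name, k, error

# Noise-free toy data generated from the saturating blueprint with k = 3.
data = [(x, 3.0 * x / (1.0 + x)) for x in [0.5, 1.0, 2.0, 4.0, 8.0]]
name, k, error = build_simulator(data)
print(name, k, error)
```

In this toy run the linear blueprint cannot match the data however its number is tuned, so the loop moves on to the saturating one, which fits. The key design point is the separation of roles: the proposer chooses the shape of the model, and the tuning step chooses its numbers from data.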