Poster in Workshop: Scaling Up Intervention Models
TrialCalibre: A Fully Automated Causal Engine for RCT Benchmarking and Observational Trial Calibration
Amir Habibdoust Lafmajani · Xing Song
Real-world evidence (RWE) studies that emulate target trials increasingly inform regulatory and clinical decisions, yet residual, hard-to-quantify biases still limit their credibility. The recently proposed BenchExCal framework addresses this challenge via a two-stage Benchmark, Expand, Calibrate process, which first compares an observational emulation against an existing randomized controlled trial (RCT), then uses the observed divergence to calibrate a second emulation that estimates the causal effect for a new indication. While methodologically powerful, BenchExCal is resource-intensive and difficult to scale.

We introduce TrialCalibre, a conceptualized multi-agent system designed to automate and scale the BenchExCal workflow. Our framework features specialized agents (such as the Orchestrator, Protocol Design, Data Synthesis, Clinical Validation, and Quantitative Calibration Agents) that coordinate the overall process. TrialCalibre incorporates agent learning (e.g., RLHF) and knowledge blackboards to support adaptive, auditable, and transparent causal effect estimation.
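The benchmark-then-calibrate idea described above can be illustrated with a minimal sketch. The abstract does not specify the calibration model, so this is not the BenchExCal or TrialCalibre method itself. It assumes a deliberately simple rule: the divergence is measured as a difference of log hazard ratios, subtracted from the second emulation, and its uncertainty is added under an independence assumption. The names `Estimate`, `benchmark_divergence`, and `calibrate` are hypothetical, and the numbers are illustrative inputs, not results.

```python
import math
from dataclasses import dataclass


@dataclass
class Estimate:
    """Effect estimate on the log hazard-ratio scale (hypothetical container)."""
    log_hr: float  # point estimate, log hazard ratio
    se: float      # standard error of the log hazard ratio


def benchmark_divergence(rct: Estimate, emulation: Estimate) -> Estimate:
    """Stage 1 (benchmark): emulation minus RCT on the log scale.

    Assumes the two estimates are independent, so variances add.
    """
    return Estimate(
        log_hr=emulation.log_hr - rct.log_hr,
        se=math.sqrt(emulation.se ** 2 + rct.se ** 2),
    )


def calibrate(new_emulation: Estimate, divergence: Estimate) -> Estimate:
    """Stage 2 (calibrate): shift the new emulation by the stage-1 divergence.

    Assumes the bias seen in stage 1 carries over unchanged to the new
    indication, which is a strong simplification made for illustration.
    """
    return Estimate(
        log_hr=new_emulation.log_hr - divergence.log_hr,
        se=math.sqrt(new_emulation.se ** 2 + divergence.se ** 2),
    )


def hazard_ratio_ci(est: Estimate, z: float = 1.96) -> tuple[float, float, float]:
    """Return (HR, lower, upper) using a normal approximation on the log scale."""
    return (
        math.exp(est.log_hr),
        math.exp(est.log_hr - z * est.se),
        math.exp(est.log_hr + z * est.se),
    )


# Illustrative inputs only (not taken from any study).
rct = Estimate(log_hr=math.log(0.80), se=0.05)
emulation = Estimate(log_hr=math.log(0.88), se=0.06)
new_emulation = Estimate(log_hr=math.log(0.75), se=0.10)

divergence = benchmark_divergence(rct, emulation)
calibrated = calibrate(new_emulation, divergence)
print(hazard_ratio_ci(calibrated))
```

The calibrated interval is wider than the uncalibrated one because the uncertainty of the benchmark comparison is carried into the second stage. A fuller treatment would model how well the stage-1 bias transfers to the new indication rather than assuming it transfers exactly.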