Poster in Workshop: The 2nd Workshop on Reliable and Responsible Foundation Models
Multi-Modal Medical Image Augmentation for Controlled Heterogeneity and Fair Outcomes
SOO YONG KIM · Seunghyeok Hong
Keywords: [ Fairness and Class Imbalance ] [ Sketch-Conditioned Diffusion ] [ Medical Image Augmentation ] [ Diversity Metric ]
Limited data in medical imaging exacerbate class imbalance and fairness gaps, undermining deep-learning performance across diverse patient subgroups. GAN- and diffusion-based augmenters can expand datasets, but they often lack precise control over multiple clinical attributes and fail to cover the full range of real-world variability. We introduce a four-step augmentation pipeline. First, an automated scoring function identifies which classes or regions most urgently need synthetic examples. Second, we construct sketch–image–text triplets from real scans, embedding age, sex, and disease labels. Third, we fine-tune a sketch-conditioned diffusion network for reliable sketch-to-image synthesis and boost variability by generating multiple, similarity-penalized sketches per case. Fourth, we propose a novel diversity metric that simultaneously measures semantic feature-space coverage and pixel-level dispersion. Unlike Fréchet Inception Distance (FID) or Inception Score (IS), it captures intra-class spread and boundary sharpness without human annotations. Experiments on chest X-rays show that our pipeline delivers high-fidelity, diverse images aligned with user-specified conditions, substantially improving fairness and generalizability.
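The abstract does not say how the similarity penalty in the third step is computed, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each candidate sketch is already represented by a feature vector and a scalar quality score, and it greedily keeps the candidate whose quality minus a penalty on its highest cosine similarity to the already-selected sketches is largest. The names `select_diverse`, `quality`, and `penalty` are hypothetical and chosen for illustration.

```python
from math import sqrt
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_diverse(
    candidates: Sequence[Sequence[float]],
    quality: Sequence[float],
    k: int,
    penalty: float = 1.0,
) -> List[int]:
    """Greedily pick up to k candidate indices, penalizing similarity to earlier picks.

    Assumption: `candidates` holds one feature vector per candidate sketch and
    `quality` holds one score per candidate. Both are hypothetical inputs; the
    abstract does not specify them.
    """
    if len(candidates) != len(quality):
        raise ValueError("candidates and quality must have the same length")

    selected: List[int] = []
    remaining = list(range(len(candidates)))

    while remaining and len(selected) < k:

        def penalized_score(i: int) -> float:
            # Highest similarity to anything already chosen (0.0 for the first pick).
            max_sim = max(
                (cosine_similarity(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return quality[i] - penalty * max_sim

        best = max(remaining, key=penalized_score)
        selected.append(best)
        remaining.remove(best)

    return selected


if __name__ == "__main__":
    # Three toy sketch embeddings: the first two are near-duplicates.
    feats = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
    scores = [1.0, 0.95, 0.6]
    print(select_diverse(feats, scores, k=2, penalty=1.0))
```

With a penalty of 1.0 the near-duplicate second candidate is skipped in favor of the dissimilar third one. Setting `penalty=0.0` reduces the procedure to plain top-k selection by quality. A larger penalty trades fidelity for variability, which is the balance the third step describes.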