Poster
in
Workshop: 1st Workshop on Foundation Models for Structured Data (FMSD)
Towards Generalizable Multimodal ECG Representation Learning with LLM-extracted Clinical Entities
Mingsheng Cai · Jiuming Jiang · Wenhao Huang · Che Liu · Rossella Arcucci
Electrocardiogram (ECG) recordings are essential for cardiac diagnostics but require large-scale annotation for supervised learning. In this work, we propose a supervised pre-training framework for multimodal ECG representation learning that leverages Large Language Model (LLM)-based clinical entity extraction from ECG reports to build structured cardiac queries. By fusing ECG signals with standardized queries rather than categorical labels, our model enables zero-shot classification of unseen conditions. Experiments on six downstream datasets demonstrate a competitive zero-shot AUC of 77.20%, outperforming state-of-the-art self-supervised and multimodal baselines by 4.98%. Our findings suggest that incorporating structured clinical knowledge via LLM-extracted entities yields more semantically aligned and generalizable ECG representations than typical contrastive or generative objectives.
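The zero-shot classification idea described above — scoring an ECG against standardized text queries in a shared embedding space instead of against fixed categorical labels — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the encoders here (`encode_ecg`, a random projection, and `encode_query`, a deterministic bag-of-words hash) are toy stand-ins for the actual learned ECG and query encoders, and the query strings are hypothetical examples of LLM-extracted clinical entities.

```python
import numpy as np

EMBED_DIM = 8  # toy shared embedding dimension (assumption, not from the paper)

def encode_ecg(signal: np.ndarray) -> np.ndarray:
    """Toy ECG encoder: fixed random projection into the shared space."""
    W = np.random.default_rng(1).normal(size=(signal.size, EMBED_DIM))
    z = signal @ W
    return z / np.linalg.norm(z)

def encode_query(query: str) -> np.ndarray:
    """Toy text encoder: deterministic bag-of-words hashing of the query."""
    z = np.zeros(EMBED_DIM)
    for tok in query.lower().split():
        z[sum(ord(c) for c in tok) % EMBED_DIM] += 1.0
    return z / (np.linalg.norm(z) + 1e-8)

def zero_shot_scores(signal: np.ndarray, queries: list[str]) -> np.ndarray:
    """Cosine similarity between the ECG embedding and each query embedding.
    New (unseen) conditions are handled by simply adding new query strings."""
    z_ecg = encode_ecg(signal)
    return np.array([z_ecg @ encode_query(q) for q in queries])

# Hypothetical structured cardiac queries built from extracted entities.
queries = ["atrial fibrillation", "sinus bradycardia", "normal sinus rhythm"]
signal = np.random.default_rng(0).normal(size=500)  # placeholder ECG signal
scores = zero_shot_scores(signal, queries)
prediction = queries[int(np.argmax(scores))]
```

Because the label set is just a list of text queries, evaluating on an unseen condition requires no retraining — only appending its query string, which is what enables the zero-shot transfer reported across the six downstream datasets.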