Poster in Affinity Workshop: New In ML
ClinicalFMamba: Mamba-based Multimodal Medical Image Fusion for Enhanced Clinical Diagnosis
Meng Zhou · Farzad Khalvati
Abstract:
Multimodal medical image fusion integrates complementary information from different imaging modalities to enhance diagnostic accuracy and treatment planning. While deep learning methods have advanced fusion performance, existing approaches face critical limitations: CNNs excel at local feature extraction but struggle to model global context effectively, while Transformers achieve superior long-range modeling at the cost of quadratic computational complexity $O(N^2)$ in self-attention, limiting clinical deployment. Recent State Space Models (SSMs) offer a promising alternative, enabling efficient long-range dependency modeling in linear time through selective mechanisms. Despite these advances, clinical validation of fused images remains underexplored. In this work, we propose ClinicalFMamba, a novel end-to-end CNN-Mamba hybrid architecture that synergistically combines local and global feature modeling. Our approach introduces Dilated Gated Convolution Blocks for hierarchical multiscale feature extraction, and a latent Mamba module that efficiently captures long-range spatial dependencies between feature regions, enabling cross-modal fusion in latent space. Comprehensive evaluations on three datasets demonstrate superior fusion performance across multiple quantitative metrics while achieving real-time fusion. Notably, we validate the clinical utility of our approach on a downstream brain tumor classification task, achieving up to 7% improvement in AUC. Our method establishes a new paradigm for efficient multimodal medical image fusion suitable for real-time clinical deployment.
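The linear-time claim for SSMs comes from replacing pairwise attention with a recurrent state update: each step touches only a fixed-size hidden state, so cost grows as $O(L \cdot N)$ in sequence length $L$ and state size $N$, not $O(L^2)$. The sketch below is a minimal, illustrative 1-D selective scan in NumPy, assuming a diagonal state matrix and simple Euler discretization; it is not the paper's implementation, and the names (`selective_scan`, `A`, `B`, `C`, `delta`) are illustrative.

```python
import numpy as np

def selective_scan(u, A, B, C, delta):
    """Minimal 1-D selective state-space scan (illustrative sketch only).

    u:     (L,) input sequence
    A:     (N,) diagonal state matrix (negative entries for stability)
    B, C:  (N,) input and output projections
    delta: (L,) per-step timescales (the 'selective' input-dependent part)

    Each step is O(N), so the whole scan is O(L*N) -- linear in L.
    """
    N = len(A)
    h = np.zeros(N)                # hidden state carried across the sequence
    y = np.empty(len(u))
    for t, (u_t, d_t) in enumerate(zip(u, delta)):
        A_bar = np.exp(d_t * A)    # discretized state transition
        B_bar = d_t * B            # Euler-discretized input projection
        h = A_bar * h + B_bar * u_t
        y[t] = np.dot(C, h)        # readout at step t
    return y
```

In Mamba-style models, `delta` (and often `B`, `C`) are predicted from the input itself, which is what makes the scan "selective" while keeping the per-step cost constant.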