ICML Poster FuseUNet: A Multi-Scale Feature Fusion Method for U-like Networks

Poster

FuseUNet: A Multi-Scale Feature Fusion Method for U-like Networks

Quansong He · Xiangde Min · Kaishen Wang · Tao He

West Exhibition Hall B2-B3 #W-212


Thu 17 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract:

Medical image segmentation is a critical task in computer vision, with UNet serving as a milestone architecture. The defining component of the UNet family is the skip connection; however, these skip connections face two significant limitations: (1) they lack effective interaction between features at different scales, and (2) they rely on simple concatenation or addition operations, which constrain efficient information integration. While recent improvements to UNet have focused on enhancing encoder and decoder capabilities, these limitations remain overlooked. To overcome these challenges, we propose a novel multi-scale feature fusion method that reimagines the UNet decoding process as solving an initial value problem (IVP), treating skip connections as discrete nodes. By leveraging principles from the linear multistep method, we propose an adaptive ordinary differential equation method to enable effective multi-scale feature fusion. Our approach is independent of the encoder and decoder architectures, making it adaptable to various U-like networks. Experiments on the ACDC, KiTS2023, MSD brain tumor, and ISIC2017/2018 skin lesion segmentation datasets demonstrate improved feature utilization and reduced network parameters while maintaining high performance. The code is available at https://github.com/nayutayuki/FuseUNet.
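
For readers unfamiliar with linear multistep methods, the following is a minimal sketch of the general idea the abstract refers to: each new value of an IVP solution is computed from several previously computed nodes rather than only the most recent one. It uses the classical two-step Adams-Bashforth scheme on a scalar test equation. It is not the FuseUNet implementation; the function name `adams_bashforth2` and the test problem y' = -y are assumptions chosen purely for illustration.

```python
import math

def adams_bashforth2(f, t0, y0, h, n_steps):
    """Integrate y' = f(t, y), y(t0) = y0 with two-step Adams-Bashforth.

    Illustrative only: a generic linear multistep solver, not the
    fusion module described in the paper.
    """
    ts = [t0]
    ys = [y0]
    # A two-step method needs two starting nodes, so the first step
    # is bootstrapped with forward Euler.
    f_prev = f(t0, y0)
    ys.append(y0 + h * f_prev)
    ts.append(t0 + h)
    for _ in range(n_steps - 1):
        f_curr = f(ts[-1], ys[-1])
        # The update combines slopes from the two most recent nodes.
        y_next = ys[-1] + h * (1.5 * f_curr - 0.5 * f_prev)
        ys.append(y_next)
        ts.append(ts[-1] + h)
        f_prev = f_curr
    return ts, ys

# Hypothetical test problem: y' = -y, y(0) = 1, exact solution exp(-t).
ts, ys = adams_bashforth2(lambda t, y: -y, 0.0, 1.0, 0.01, 100)
error = abs(ys[-1] - math.exp(-ts[-1]))
```

In the analogy drawn by the abstract, the previously computed nodes play the role of skip-connection features at different scales, so each decoding step can draw on several earlier stages instead of a single one.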

Lay Summary:

Medical imaging helps doctors detect diseases by analyzing scans. A popular AI model for this task is called UNet. However, UNet often struggles to combine image details from different scales, which limits its accuracy. We propose a new method that treats this combination process like solving a mathematical problem step by step. This allows the model to better integrate information from different parts of the image, improving its understanding. Our solution works with many types of existing models, improves results across several datasets, and requires fewer computing resources, potentially making medical AI tools more efficient and accurate.
