Poster
DVI: A Derivative-based Vision Network for INR
Runzhao Yang · Xiaolong Wu · Zhihong Zhang · Fabian Zhang · Tingxiong Xiao · Zongren Li · Kunlun He · Jinli Suo
West Exhibition Hall B2-B3 #W-221
Recent advancements in computer vision have seen Implicit Neural Representations (INR) become a dominant representation form for data due to their compactness and expressive power. Vision networks that solve tasks on INR data are either purely INR-based, and thereby limited by simplistic operations and performance constraints, or rely on raster-based methods, which tend to lose crucial structural information of the INR during the conversion process. To address these issues, we propose DVI, a novel Derivative-based Vision network for INR, capable of handling a variety of vision tasks across various data modalities, while achieving the best performance among existing methods by incorporating state-of-the-art raster-based methods into an INR-based architecture. DVI extracts semantic information from the high-order derivative map of the INR, then fuses it into a pre-existing raster-based vision network, enhancing its performance with deeper, task-relevant semantic insights. Extensive experiments on five vision tasks across three data modalities demonstrate DVI's superiority over existing methods. Additionally, comprehensive ablation studies confirm the efficacy of each element of DVI and examine the influence of different derivative computation techniques and the impact of derivative orders. Reproducible code is provided in the supplementary materials.
Computer vision systems struggle to effectively process Implicit Neural Representations (INR) data, either using limited INR-based methods or losing crucial structural information when converting to traditional formats. We developed DVI, a Derivative-based Vision network that extracts structural information from high-order derivatives of INR data and seamlessly integrates it with existing vision networks. DVI handles multiple vision tasks across various data types with superior performance. This advancement helps computers better "understand" complex visual data, potentially improving applications from medical imaging to autonomous driving.
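To make the notion of a "high-order derivative map" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy INR made of a single sine layer, f(x, y) = Σ a_k sin(wx_k·x + wy_k·y + b_k), whose partial derivatives have a closed form, and samples them on a regular grid. All names (`make_inr`, `inr_derivative`, `derivative_maps`) and the grid and order settings are hypothetical choices for illustration; how DVI actually computes derivatives and fuses them into a raster-based network is not specified here.

```python
import math
import random


def make_inr(hidden=8, seed=0):
    """Toy INR (assumption): f(x, y) = sum_k a_k * sin(wx_k*x + wy_k*y + b_k)."""
    rng = random.Random(seed)
    return [
        (rng.uniform(-1, 1), rng.uniform(-3, 3), rng.uniform(-3, 3),
         rng.uniform(-math.pi, math.pi))
        for _ in range(hidden)
    ]


def inr_value(params, x, y):
    """Evaluate the toy INR at coordinate (x, y)."""
    return sum(a * math.sin(wx * x + wy * y + b) for a, wx, wy, b in params)


def inr_derivative(params, x, y, order_x, order_y):
    """Analytic mixed partial of the toy INR.

    Each differentiation of sin(u) shifts the phase by pi/2 and multiplies
    by the corresponding frequency, so the n-th derivative stays closed-form.
    """
    n = order_x + order_y
    return sum(
        a * wx ** order_x * wy ** order_y
        * math.sin(wx * x + wy * y + b + n * math.pi / 2)
        for a, wx, wy, b in params
    )


def derivative_maps(params, size=16, max_order=2):
    """Sample every partial derivative up to max_order on a grid over [-1, 1]^2.

    Returns a dict keyed by (order_x, order_y); each value is a list of rows.
    """
    coords = [-1 + 2 * i / (size - 1) for i in range(size)]
    maps = {}
    for ox in range(max_order + 1):
        for oy in range(max_order + 1 - ox):
            maps[(ox, oy)] = [
                [inr_derivative(params, x, y, ox, oy) for x in coords]
                for y in coords
            ]
    return maps


params = make_inr()
maps = derivative_maps(params, size=16, max_order=2)
print(sorted(maps))  # one channel per (order_x, order_y) pair
```

In this sketch, the `(0, 0)` channel is the ordinary rasterized signal, and the higher-order channels carry structure that plain rasterization would discard. A real INR would typically be a deeper network, where such derivatives would be obtained with automatic differentiation or finite differences rather than a closed form.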