Poster
One-dimensional Path Convolution
Xuanshu Luo · Martin Werner
East Exhibition Hall A-B #E-3303
Two-dimensional (2D) convolutional kernels have dominated convolutional neural networks (CNNs) in image processing. While linearly scaling 1D convolution provides parameter efficiency, its naive integration into CNNs disrupts image locality, thereby degrading performance. This paper presents path convolution (PathConv), a novel CNN design built exclusively from 1D operations, achieving ResNet-level accuracy using only 1/3 of the parameters. To obtain locality-preserving image traversal paths, we analyze Hilbert/Z-order paths and expose a fundamental trade-off: improved proximity for most pixels comes at the cost of excessive distances between the remaining, sacrificed pixels and their neighbors. We resolve this issue by proposing path shifting, a succinct method to reposition sacrificed pixels. Using the randomized rounding algorithm, we show that three shifted paths are sufficient to offer better locality preservation than trivial raster scanning. To mitigate potential convergence issues caused by multiple paths, we design a lightweight path-aware channel attention mechanism to capture local intra-path and global inter-path dependencies. Experimental results further validate the efficacy of our method, establishing the proposed 1D PathConv as a viable backbone for efficient vision models.
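The trade-off described in the abstract can be illustrated with a small sketch. It compares raster scanning with a plain Z-order (Morton) path on a square grid by measuring, for every pair of 4-adjacent pixels, how far apart the two pixels end up along the path. The gap metric and the helper names (`raster_index`, `z_order_index`, `neighbor_gaps`) are assumptions made for illustration; the paper's own locality measure, path shifting, and randomized rounding are not reproduced here.

```python
import numpy as np

def raster_index(n):
    """Path position of each pixel under row-major (raster) scanning."""
    return np.arange(n * n).reshape(n, n)

def z_order_index(n):
    """Path position of each pixel under Z-order traversal (n a power of two)."""
    bits = n.bit_length() - 1
    rows, cols = np.indices((n, n))
    idx = np.zeros((n, n), dtype=np.int64)
    for b in range(bits):
        # Interleave bits: column bits go to even positions, row bits to odd.
        idx |= ((cols >> b) & 1) << (2 * b)
        idx |= ((rows >> b) & 1) << (2 * b + 1)
    return idx

def neighbor_gaps(idx):
    """Absolute path-position gap for every pair of 4-adjacent pixels."""
    horizontal = np.abs(idx[:, 1:] - idx[:, :-1]).ravel()
    vertical = np.abs(idx[1:, :] - idx[:-1, :]).ravel()
    return np.concatenate([horizontal, vertical])

n = 8  # illustrative grid size, not taken from the paper
for name, idx in [("raster", raster_index(n)), ("z-order", z_order_index(n))]:
    gaps = neighbor_gaps(idx)
    print(name, "median gap:", np.median(gaps), "max gap:", gaps.max())
```

On this 8×8 grid the median gap drops from 4.5 (raster) to 2 (Z-order), while the maximum gap grows from 8 to 22: most neighboring pixels move closer together on the path, and a minority are pushed much farther apart.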
Modern computer vision systems use complex 2D operations to analyze images, making them computationally expensive and resource-intensive. While 1D operations could be more efficient, they typically disrupt the natural neighborhood relationships between pixels, leading to poor performance.
We developed Path Convolution (PathConv), a novel approach that processes images using only 1D operations. We preserve local pixel relationships by designing special pixel traversal paths and applying a path-shifting technique. Using randomized rounding algorithms, we show that just three shifted paths provide better locality preservation than traditional raster scanning. We also created a lightweight attention mechanism that helps the system understand relationships within and between these paths.
Our PathConv models achieve accuracy comparable to standard methods (ResNet) while using only one-third of the model parameters. This makes image processing more accessible for devices with limited computing power, potentially enabling more efficient computer vision on smartphones, IoT devices, and other resource-constrained systems.
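To make "processing images using only 1D operations" concrete, here is a minimal single-channel sketch: the image is flattened along a traversal path, filtered with a 1D kernel, and scattered back to the pixel grid. The function `path_conv1d` and its arguments are hypothetical names chosen for this illustration, not the authors' implementation; it omits channels, learned weights, multiple paths, and the attention mechanism.

```python
import numpy as np

def path_conv1d(image, order, kernel):
    """Filter an image along a traversal path with a 1D kernel.

    order[i] is the flat (row-major) index of the i-th pixel visited.
    """
    flat = image.ravel()[order]                       # pixels in path order
    # Cross-correlation with zero padding, as in CNN "convolution" layers.
    filtered = np.correlate(flat, kernel, mode="same")
    out = np.empty(flat.shape, dtype=filtered.dtype)
    out[order] = filtered                             # scatter back to pixel positions
    return out.reshape(image.shape)

image = np.arange(16, dtype=float).reshape(4, 4)
raster = np.arange(image.size)            # raster scanning as the simplest path
smooth = np.array([1.0, 1.0, 1.0]) / 3.0  # 3-tap averaging kernel
blurred = path_conv1d(image, raster, smooth)
```

Any permutation of the pixel indices can be passed as `order`, so a locality-preserving path can replace raster scanning without changing the filtering step. A k-tap 1D kernel holds k weights where a k×k 2D kernel holds k², which is the linear scaling the abstract refers to.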