Poster in Affinity Workshop: New In ML
DFYP: Dynamic CNN and ViT Fusion with Adaptive Sobel-Convolution for Enhanced Crop Yield Prediction
Zeyu Yan · Juli Zhang · Jing Zhang · Ma Zhong · Qiguang Miao · Quan Wang
Accurate crop yield prediction remains challenging due to complex environmental variability and multi-scale spatial patterns. We propose DFYP, a dynamic fusion framework that adaptively integrates CNN and Vision Transformer (ViT) features to capture both local textures and global spatial dependencies. To enhance robustness in remote sensing scenarios, DFYP incorporates a learnable Sobel-based edge operator and improved Squeeze-and-Excitation (SE) blocks for spectral-aware channel re-weighting. Our model dynamically adjusts fusion weights during training, enabling better adaptation to varying image resolutions and crop types. Experiments on MODIS and Sentinel-2 datasets demonstrate that DFYP consistently outperforms state-of-the-art baselines across all evaluation metrics. These results confirm the model’s strong generalization ability and its effectiveness in handling both fine-grained and high-resolution satellite imagery.
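The abstract names two mechanisms without giving their exact form: an edge operator based on Sobel whose weights can be learned, and fusion weights that shift between the CNN and ViT branches. The following is a minimal sketch of both ideas in plain Python, not the authors' implementation. It assumes the learnable kernel is simply initialized to the classic Sobel values and that the fusion is a softmax-gated convex combination of two equal-length feature vectors. The names `conv2d_valid`, `fuse`, and `gate_logits` are hypothetical.

```python
import math

# Classic horizontal-gradient Sobel kernel. Assumption: in a learnable variant
# these nine values would only be the starting point and training would update them.
SOBEL_X = [[-1.0, 0.0, 1.0],
           [-2.0, 0.0, 2.0],
           [-1.0, 0.0, 1.0]]


def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation of a 2D list with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(cnn_feat, vit_feat, gate_logits):
    """Blend two feature vectors with weights that sum to one.

    Assumption: gate_logits stands in for whatever trainable quantity
    produces the fusion weights; here it is just a pair of numbers.
    """
    w_cnn, w_vit = softmax(gate_logits)
    return [w_cnn * c + w_vit * v for c, v in zip(cnn_feat, vit_feat)]


# Toy 4x4 patch with a vertical edge: left half 0, right half 1.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
edges = conv2d_valid(image, SOBEL_X)   # strong response along the edge

# Equal logits give equal weight to both branches.
fused = fuse([1.0, 2.0], [3.0, 4.0], gate_logits=[0.0, 0.0])
```

Raising the first logit relative to the second moves the fused output toward the CNN features, which is one simple way a model could favor local texture over global context for a given input.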