Poster in Workshop: Programmatic Representations for Agent Learning
How Robust Reinforcement Learning Enables Courier-Friendly Route Planning for Last-Mile Delivery?
Ziying Jia · Zeyu Dong · Miao Yin · Sihong He
Abstract:
Last-mile delivery (LMD) systems increasingly face dynamic customer demands that introduce uncertainty and lead to unstable delivery routes, reducing efficiency and placing cognitive burdens on couriers. To address this, we propose R$^3$S$^2$Route, a Robust Regularizer-enhanced RL-based Smooth and Stable Routing Algorithm that learns courier-friendly policies under state uncertainty. Our method adopts an actor-critic reinforcement learning framework and incorporates a robustness regularizer that penalizes the policy's sensitivity to input perturbations. We formally define route smoothness and stability as courier-friendliness metrics and integrate them into the learning framework, producing routing policies that are both geometrically intuitive and spatio-temporally consistent. Experimental results demonstrate that R$^3$S$^2$Route achieves up to a 59.68\% improvement in route smoothness and 14.29\% in route stability, while maintaining low travel distances and time-window violation rates, outperforming several baselines in dynamic delivery environments.
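The abstract describes a robustness regularizer that penalizes the policy's sensitivity to perturbations of the input state. One common way to realize such a term (a sketch under our own assumptions; the paper's exact formulation, policy architecture, and perturbation model are not given in the abstract) is to measure the divergence between the action distribution at a state and at nearby perturbed states, and add that divergence to the training loss. The minimal example below uses a hypothetical linear-softmax policy over candidate next stops and estimates the penalty by sampling bounded perturbations:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def policy(theta, s):
    # Hypothetical linear-softmax policy: action probabilities over
    # candidate next delivery stops, given state features s.
    return softmax(theta @ s)

def robustness_regularizer(theta, s, epsilon=0.1, n_samples=16, seed=0):
    """Estimate E_delta[ KL(pi(.|s) || pi(.|s+delta)) ] over random
    perturbations with ||delta||_inf <= epsilon. A small value means the
    policy's action distribution is insensitive to state uncertainty;
    adding this term to the actor loss discourages unstable routes.
    (Illustrative only -- the paper may use a different divergence or
    an adversarial, worst-case perturbation instead of sampling.)"""
    rng = np.random.default_rng(seed)
    p = policy(theta, s)
    kls = []
    for _ in range(n_samples):
        delta = rng.uniform(-epsilon, epsilon, size=s.shape)
        q = policy(theta, s + delta)
        kls.append(np.sum(p * (np.log(p) - np.log(q))))
    return float(np.mean(kls))

# Toy usage: 3 candidate stops, 4-dimensional state features.
rng = np.random.default_rng(42)
theta = rng.normal(size=(3, 4))
s = rng.normal(size=4)
penalty = robustness_regularizer(theta, s, epsilon=0.1)
# Total actor loss would then be: policy_loss + lambda * penalty,
# with lambda trading off task reward against robustness.
```

Because the KL term grows with the size of the perturbation for a smooth policy, shrinking `epsilon` shrinks the penalty, and the penalty vanishes when the state is unperturbed.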