

Spotlight Poster

LipsNet++: Unifying Filter and Controller into a Policy Network

Xujie Song · Liangfa Chen · Tong Liu · Wenxuan Wang · Yinuo Wang · Shentao Qin · Yinsong Ma · Jingliang Duan · Shengbo Li

West Exhibition Hall B2-B3 #W-704
[ Project Page ]
Wed 16 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

Deep reinforcement learning (RL) is effective for decision-making and control tasks like autonomous driving and embodied AI. However, RL policies often suffer from the action fluctuation problem in real-world applications, resulting in severe actuator wear, safety risks, and performance degradation. This paper identifies the two fundamental causes of action fluctuation: observation noise and policy non-smoothness. We propose LipsNet++, a novel policy network with a Fourier filter layer and a Lipschitz controller layer that separately address these two causes. The filter layer incorporates a trainable filter matrix that automatically extracts important frequencies while suppressing noise frequencies in the observations. The controller layer introduces a Jacobian regularization technique to achieve a low Lipschitz constant, ensuring smooth fitting of the policy function. These two layers function analogously to the filter and controller in classical control theory, suggesting that filtering and control capabilities can be seamlessly integrated into a single policy network. Both simulated and real-world experiments demonstrate that LipsNet++ achieves state-of-the-art noise robustness and action smoothness. The code and videos are publicly available at https://xjsong99.github.io/LipsNet_v2.
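To make the two ideas in the abstract concrete, below is a minimal PyTorch sketch (not the authors' implementation) of a trainable frequency-domain filter applied to a window of recent observations, followed by a Jacobian penalty that serves as a proxy for the controller's local Lipschitz constant. All names and hyperparameters here (FourierFilter, jacobian_penalty, history_len, the coefficient lam, the MLP controller) are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch: a learnable frequency-domain observation filter plus a
# Jacobian regularizer for action smoothness. Illustrative only.
import torch
import torch.nn as nn


class FourierFilter(nn.Module):
    """Filter a (batch, history_len, obs_dim) observation window in frequency space."""

    def __init__(self, history_len: int, obs_dim: int):
        super().__init__()
        n_freq = history_len // 2 + 1  # number of rFFT frequency bins
        # Trainable per-frequency, per-dimension gains (initialized as an all-pass filter).
        self.gains = nn.Parameter(torch.ones(n_freq, obs_dim))

    def forward(self, obs_window: torch.Tensor) -> torch.Tensor:
        spec = torch.fft.rfft(obs_window, dim=1)                    # to frequency domain
        spec = spec * self.gains                                    # scale each frequency bin
        filtered = torch.fft.irfft(spec, n=obs_window.shape[1], dim=1)
        return filtered[:, -1, :]                                   # latest filtered observation


def jacobian_penalty(policy: nn.Module, obs: torch.Tensor) -> torch.Tensor:
    """Squared Frobenius norm of d(action)/d(obs), a proxy for the local Lipschitz constant."""
    obs = obs.detach().requires_grad_(True)
    act = policy(obs)
    jac_rows = []
    for i in range(act.shape[-1]):
        grad_i = torch.autograd.grad(act[:, i].sum(), obs, create_graph=True)[0]
        jac_rows.append(grad_i)
    jac = torch.stack(jac_rows, dim=1)                              # (batch, act_dim, obs_dim)
    return (jac ** 2).sum(dim=(1, 2)).mean()


# Usage: filter the observation window, act on the result, and add the penalty
# (weighted by a hypothetical coefficient lam) to the RL actor loss.
if __name__ == "__main__":
    history_len, obs_dim, act_dim, lam = 8, 4, 2, 1e-3
    filt = FourierFilter(history_len, obs_dim)
    controller = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))

    obs_window = torch.randn(32, history_len, obs_dim)
    clean_obs = filt(obs_window)
    action = controller(clean_obs)
    smoothness_loss = lam * jacobian_penalty(controller, clean_obs)
    print(action.shape, smoothness_loss.item())
```

In this sketch, the filter gains and the Jacobian penalty would both be optimized jointly with the usual RL actor loss, so filtering and smooth control live inside a single policy network, which is the integration the abstract describes.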

Lay Summary:

Deep reinforcement learning (RL) systems used in autonomous vehicles and robots often produce unstable, jittery actions due to noisy sensor data and overly sensitive decision-making algorithms, causing accelerated hardware wear and safety risks. In response, we introduce LipsNet++, a unified policy architecture inspired by classical control theory. It embeds an adaptive filtering stage (analogous to noise-canceling headphones removing spurious signals) and a Lipschitz smoothing stage (akin to shock absorbers damping abrupt motions). Experimental validation across simulated and physical platforms shows that LipsNet++ substantially reduces action fluctuation compared to standard deep RL policies, enhancing the robustness, reliability, and lifespan of AI systems operating in complex, unpredictable environments.
