Poster
RobustZero: Enhancing MuZero Reinforcement Learning Robustness to State Perturbations
Yushuai Li · Hengyu Liu · Torben Pedersen · Yuqiang He · Kim Larsen · Lu Chen · Christian Jensen · Jiachen Xu · Tianyi Li
West Exhibition Hall B2-B3 #W-700
MuZero, a recent reinforcement learning (RL) method, has achieved remarkable success in games, surpassing human performance. Its strengths have enabled its adoption in real-world decision-making tasks, e.g., autonomous driving and voltage control. However, in these settings, systems often encounter state perturbations: errors in input states caused by sensor noise or malicious attacks. These perturbations can mislead the MuZero agent into suboptimal or unsafe decisions.

To address this challenge, we propose RobustZero, a novel robust RL method that extends MuZero to defend against state perturbations. RobustZero combines contrastive learning with an adaptive adjustment mechanism to produce policies that remain consistent before and after perturbations.

Notably, RobustZero consistently outperforms existing methods, particularly in environments with noisy or adversarial inputs. Our results highlight the importance of robustness in RL and provide insights into designing agents that remain reliable even under imperfect observations.
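The abstract does not spell out the exact training objective, but the contrastive idea it describes can be sketched as an InfoNCE-style consistency loss: the latent embedding of a clean state and the embedding of its perturbed counterpart form a positive pair, while other states in the batch serve as negatives. The function name, temperature value, and pure-Python setup below are illustrative assumptions, not the paper's actual implementation.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_consistency_loss(clean, perturbed, temperature=0.5):
    """Hypothetical InfoNCE-style loss: each clean embedding should be
    most similar to the perturbed embedding of the *same* state,
    relative to the other states in the batch (the negatives)."""
    n = len(clean)
    loss = 0.0
    for i in range(n):
        logits = [cosine(clean[i], perturbed[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-softmax probability of the positive pair (i, i).
        loss += -(logits[i] - log_sum)
    return loss / n
```

Minimizing this loss pulls clean and perturbed representations of the same state together, which is one standard way to encourage a policy that behaves consistently under input noise. For example, a batch whose perturbed embeddings stay aligned with their clean counterparts yields a lower loss than one where they are swapped:

```python
clean = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]      # mild perturbation of each state
misaligned = [[0.1, 0.9], [0.9, 0.1]]   # embeddings crossed over
assert contrastive_consistency_loss(clean, aligned) \
     < contrastive_consistency_loss(clean, misaligned)
```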