Poster in Workshop: Programmatic Representations for Agent Learning
Discovering Logic-Informed Intrinsic Rewards to Explain Human Policies
Chengzhi Cao · Yinghao Fu · Chao Yang · Shuang Li
In high-stakes fields such as healthcare, it is crucial to distill strategic insights from expert clinicians. This paper focuses on extracting such knowledge from expert demonstrations, representing it as a set of logical rules. Our learning framework builds on classic Inverse Reinforcement Learning (IRL). We assume that experts, such as clinicians, are rational: the treatments they choose are optimal given their logical understanding of the situation. Our algorithm automatically extracts these logical rules from their demonstrations. We introduce a neural logic tree generator, trained to generate logical statements step by step, starting from the goal and working backward, mirroring the way humans engage in backward reasoning. Conversely, we interpret policy planning as a forward reasoning process, in which the optimal policy is found by searching for the best path forward under the generated rules. The neural logic tree generator and the policy are learned jointly within the IRL framework until convergence, ultimately yielding the most effective strategic rules. As a by-product, our algorithm also recovers the reward function. Our experiments demonstrate that the method excels at discovering meaningful logical rules, particularly in healthcare settings.
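To make the alternating scheme concrete, here is a minimal, self-contained Python sketch: backward chaining proposes candidate rules from the goal, a forward-chaining policy fires the highest-weight applicable rule, and an IRL-style loop fits rule weights to expert demonstrations. The toy medical domain, rule set, function names, and update rule are all illustrative assumptions, not the paper's actual method (which uses a neural generator rather than hand-written rules).

```python
# Minimal sketch of the alternating scheme described above: backward-chained
# rule proposal, a forward-chaining policy, and IRL-style weight fitting.
# The toy medical domain and all names are illustrative assumptions.

GOAL = "recovered"

# Candidate logical rules as (body, head) pairs over symbolic predicates.
CANDIDATE_RULES = [
    (frozenset({"infection", "antibiotic"}), "recovered"),
    (frozenset({"fever"}), "infection"),
    (frozenset({"fever", "rest"}), "recovered"),
]

def backward_chain(goal, rules):
    """Backward reasoning: keep only rules that (transitively) derive the goal."""
    kept, frontier = [], {goal}
    while frontier:
        target = frontier.pop()
        for body, head in rules:
            if head == target and (body, head) not in kept:
                kept.append((body, head))
                frontier |= body  # each body atom becomes a new sub-goal
    return kept

def forward_policy(state, rules, weights):
    """Forward reasoning: greedily fire the highest-weight applicable rule."""
    state, trace = set(state), []
    while GOAL not in state:
        applicable = [r for r in rules if r[0] <= state and r[1] not in state]
        if not applicable:
            break
        rule = max(applicable, key=lambda r: weights[r])
        state.add(rule[1])
        trace.append(rule)
    return trace

# Expert demonstrations, reduced to the rules each trajectory exercised.
expert_demos = [
    [(frozenset({"fever"}), "infection"),
     (frozenset({"infection", "antibiotic"}), "recovered")],
] * 5

rules = backward_chain(GOAL, CANDIDATE_RULES)
weights = {r: 0.0 for r in rules}

# Expert rule-usage frequencies (the "features" the policy should match).
expert_counts = {r: 0.0 for r in rules}
for demo in expert_demos:
    for r in demo:
        expert_counts[r] += 1.0 / len(expert_demos)

# IRL-style alternation: nudge rule weights so the forward policy's rule
# usage matches the expert's (a crude stand-in for MaxEnt feature matching).
start_state = {"fever", "antibiotic", "rest"}
for _ in range(50):
    trace = forward_policy(start_state, rules, weights)
    for r in rules:
        weights[r] += 0.1 * (expert_counts[r] - (1.0 if r in trace else 0.0))

for (body, head), w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{set(body)} -> {head}: weight {w:+.2f}")
```

In this sketch the weights converge once the policy fires the same rules as the expert, upweighting the expert's antibiotic rule over the spurious "rest" shortcut; the learned weights play the role of the recovered reward over rules.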