Poster
Reidentify: Context-Aware Identity Generation for Contextual Multi-Agent Reinforcement Learning
Zhiwei XU · Kun Hu · Xin Xin · Weiliang Meng · Yiwei Shi · Hangyu Mao · Bin Zhang · Dapeng Li · Jiangjin Yin
West Exhibition Hall B2-B3 #W-712
Generalizing multi-agent reinforcement learning (MARL) to accommodate variations in problem configurations remains a critical challenge in real-world applications, where even subtle differences in task setups can cause pre-trained policies to fail. To address this, we propose Context-Aware Identity Generation (CAID), a novel framework to enhance MARL performance under the Contextual MARL (CMARL) setting. CAID dynamically generates unique agent identities through the agent identity decoder built on a causal Transformer architecture. These identities provide contextualized representations that align corresponding agents across similar problem variants, facilitating policy reuse and improving sample efficiency. Furthermore, the action regulator in CAID incorporates these agent identities into the action-value space, enabling seamless adaptation to varying contexts. Extensive experiments on CMARL benchmarks demonstrate that CAID significantly outperforms existing approaches by enhancing both sample efficiency and generalization across diverse context variants.
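The abstract names two components, an identity decoder built on a causal Transformer and an action regulator that brings identities into the action-value space, without giving their details. The following is a minimal toy sketch of how such pieces could fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`causal_self_attention`, `generate_identities`, `regulate_q_values`), the single-head attention layer, the choice to prepend one context token to the agent features, and the additive per-action bias used as the regulator.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def causal_self_attention(tokens, Wq, Wk, Wv):
    """Single-head attention where position i attends only to positions <= i."""
    d = len(Wq)
    qs = [matvec(Wq, t) for t in tokens]
    ks = [matvec(Wk, t) for t in tokens]
    vs = [matvec(Wv, t) for t in tokens]
    out = []
    for i, q in enumerate(qs):
        # Causal mask: only keys 0..i are scored for query i.
        scores = [sum(a * b for a, b in zip(q, ks[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        weights = softmax(scores)
        out.append([sum(w * vs[j][k] for j, w in enumerate(weights))
                    for k in range(d)])
    return out


def generate_identities(context, agent_feats, params):
    """Hypothetical identity decoder: one context token, then one token per agent.

    The identity of agent i is the attention output at its position, so it
    depends on the context and on agents 0..i only.
    """
    tokens = [context] + agent_feats
    hidden = causal_self_attention(tokens, params["Wq"], params["Wk"], params["Wv"])
    return hidden[1:]  # drop the context position


def regulate_q_values(q_values, identity, W_reg):
    """Hypothetical action regulator: add an identity-dependent bias per action."""
    bias = matvec(W_reg, identity)
    return [q + b for q, b in zip(q_values, bias)]


rng = random.Random(0)
d, n_actions = 4, 3
params = {name: random_matrix(d, d, rng) for name in ("Wq", "Wk", "Wv")}
W_reg = random_matrix(n_actions, d, rng)

context = [1.0, 0.0, 0.5, -0.5]  # made-up context embedding
agents = [[0.1, 0.2, 0.3, 0.4],  # made-up per-agent features
          [0.4, 0.3, 0.2, 0.1],
          [0.0, 1.0, 0.0, 1.0]]

identities = generate_identities(context, agents, params)
regulated = regulate_q_values([0.0, 0.0, 0.0], identities[0], W_reg)
```

In this sketch, changing the context token changes every agent's identity, while changing a later agent's features leaves earlier identities untouched. That second property is what the causal mask provides here; whether CAID orders or conditions agents this way is not stated in the abstract.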
In many real-world scenarios, teams of AI agents need to work together in changing environments, for example robots cooperating in different room layouts or traffic lights adjusting to different intersections. However, most current training methods for teams of agents struggle when even small details of the environment change. Our work introduces a new approach called Context-Aware Identity Generation (CAID) that helps AI agents adapt to such changes. CAID gives each agent a unique identity based on the situation it is in, allowing agents to understand their roles more clearly and act more effectively as a team. We tested this method in several challenging simulation environments and found that CAID helps agents learn faster and perform more reliably across a wide range of situations.