

Poster in Affinity Workshop: New In ML

Backdoor Defense via Boundary Reconstruction

Hengrui Yan · Xinran Zheng · Shuo Yang · Xingjun Wang


Abstract:

The widespread use of third-party and open-source models has heightened the risk of backdoor attacks, in which triggers distort model behavior by creating misleading shortcuts. Existing backdoor unlearning methods mitigate these risks but remain vulnerable to re-injection attacks, which let adversaries reinstate similar backdoors. We identify label transferability and residual shortcuts as the root causes of this vulnerability. To address it, we propose BDBR (Backdoor Defense via Boundary Reconstruction), which prevents adversaries from replaying the original shortcuts by isolating benign and poisoned labels and reconstructing the decision boundary. BDBR involves two steps: introducing a shadow class to remap poisoned samples, then pruning this class to cut off backdoor shortcuts. Experiments on benchmark datasets show that BDBR achieves state-of-the-art backdoor defense, excelling in particular at resisting re-injection attacks.
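As a rough illustration of the two-step procedure described in the abstract, the sketch below shows one plausible PyTorch realization. It is not the authors' implementation: `model` (assumed to expose its output layer as `model.fc`), `poisoned_loader`, `clean_loader`, `shadow_idx`, and all hyperparameters are hypothetical placeholders, and the upstream detection of poisoned samples is assumed to exist.

import torch
import torch.nn as nn

# Hypothetical setup: `model` is a classifier with `num_classes` outputs whose
# final layer is `model.fc`; `poisoned_loader` yields (input, label) batches
# flagged as backdoored by some upstream detector. All names are illustrative.

def add_shadow_class(model: nn.Module, num_classes: int) -> nn.Module:
    """Step 1 (sketch): widen the output layer with one extra 'shadow' class
    so suspected poisoned samples can be remapped away from benign labels."""
    old_fc = model.fc
    new_fc = nn.Linear(old_fc.in_features, num_classes + 1)
    with torch.no_grad():
        new_fc.weight[:num_classes] = old_fc.weight  # keep benign boundary
        new_fc.bias[:num_classes] = old_fc.bias
    model.fc = new_fc
    return model

def remap_and_finetune(model, poisoned_loader, clean_loader, shadow_idx,
                       epochs=1, lr=1e-3, device="cpu"):
    """Fine-tune so poisoned samples map to the shadow class while clean
    samples keep their labels, isolating benign and poisoned labels."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for (x_p, _), (x_c, y_c) in zip(poisoned_loader, clean_loader):
            x_p, x_c, y_c = x_p.to(device), x_c.to(device), y_c.to(device)
            # Remap every suspected poisoned sample to the shadow label.
            y_p = torch.full((x_p.size(0),), shadow_idx, device=device)
            loss = loss_fn(model(x_p), y_p) + loss_fn(model(x_c), y_c)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

def prune_shadow_class(model, num_classes):
    """Step 2 (sketch): drop the shadow row from the output layer so the
    trigger-to-shadow shortcut is severed and cannot be replayed."""
    fc = model.fc
    new_fc = nn.Linear(fc.in_features, num_classes)
    with torch.no_grad():
        new_fc.weight.copy_(fc.weight[:num_classes])
        new_fc.bias.copy_(fc.bias[:num_classes])
    model.fc = new_fc
    return model

Under this construction, pruning after remapping removes the only label the trigger has been taught to reach, so replaying the original shortcut no longer steers predictions toward any benign class.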
