

Poster

Targeted Unlearning with Single Layer Unlearning Gradient

Zikui Cai · Yaoteng Tan · M. Salman Asif

East Exhibition Hall A-B #E-2204
[ Project Page ]
Tue 15 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract:

Machine unlearning methods aim to remove sensitive or unwanted content from trained models, but typically demand extensive model updates at significant computational cost while potentially degrading model performance on both related and unrelated tasks. We propose Single Layer Unlearning Gradient (SLUG) as an efficient method to unlearn targeted information by updating a single critical layer using a one-time gradient computation. SLUG uses layer importance and gradient alignment metrics to identify the optimal layer for targeted information removal while preserving model utility. We demonstrate the effectiveness of SLUG for CLIP, Stable Diffusion, and vision-language models (VLMs) in removing concrete concepts (e.g., identities and objects) and abstract concepts (e.g., artistic styles). On the UnlearnCanvas benchmark, SLUG achieves unlearning performance comparable to existing methods while requiring significantly fewer computational resources. Our proposed approach offers a practical solution for targeted unlearning that is computationally efficient and precise. Our code is available at https://github.com/CSIPlab/SLUG.
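To make the recipe concrete, below is a minimal PyTorch sketch of the idea the abstract describes: compute gradients on a forget objective and a retain objective once, score each layer by an importance metric and by the alignment between the two gradients, then apply a single gradient-ascent step to the one selected layer. The function name, the exact scoring combination, and the step size here are illustrative assumptions, not the paper's exact formulation; the actual implementation is in the linked repository.

```python
import torch
import torch.nn.functional as F


def single_layer_unlearn(model, forget_loss, retain_loss, lr=1.0):
    """Illustrative sketch (not the official SLUG implementation).

    One-time gradient computation, per-layer scoring by importance and
    gradient alignment, then a single update to one selected layer.
    """
    params = list(model.named_parameters())
    # Gradients are computed once; there is no iterative retraining loop.
    forget_grads = torch.autograd.grad(
        forget_loss, [p for _, p in params], retain_graph=True, allow_unused=True)
    retain_grads = torch.autograd.grad(
        retain_loss, [p for _, p in params], allow_unused=True)

    best_name, best_grad, best_score = None, None, float("-inf")
    for (name, p), gf, gr in zip(params, forget_grads, retain_grads):
        if gf is None or gr is None:
            continue
        # Importance: forget-gradient magnitude relative to the weight norm.
        importance = (gf.norm() / (p.norm() + 1e-12)).item()
        # Alignment: overlap between forget and retain gradients; low overlap
        # suggests updating this layer disturbs retained knowledge less.
        alignment = F.cosine_similarity(gf.flatten(), gr.flatten(), dim=0).abs().item()
        score = importance - alignment  # assumed combination, for illustration only
        if score > best_score:
            best_name, best_grad, best_score = name, gf, score

    # One-step update: gradient ascent on the forget loss, applied only to
    # the selected layer; every other parameter is left untouched.
    with torch.no_grad():
        dict(params)[best_name].add_(best_grad, alpha=lr)
    return best_name
```

Under these assumptions, a call such as `single_layer_unlearn(clip_model, forget_loss, retain_loss)` would modify exactly one weight tensor in place, which is what keeps the method's compute and memory footprint small.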

Lay Summary:

Modern generative models can create misinformation through celebrity impersonation, unauthorized copying of artwork, or misuse of styles. To tackle this, researchers use machine unlearning: removing certain knowledge while keeping the model's original power. However, existing methods typically need many updates and repeated calculations, which are computationally costly.

We propose Single Layer Unlearning Gradient (SLUG), a highly efficient alternative. Instead of retraining the whole model, SLUG finds one key layer of the model to update, using a smart metric that identifies where the targeted information is stored. With just a single gradient calculation and a one-step update, SLUG removes the unwanted information while preserving the model's ability to perform other tasks.

We show that SLUG works well on popular models like CLIP and Stable Diffusion, effectively forgetting specific identities, objects, or styles with far less effort than traditional methods. Our code is publicly available at https://github.com/CSIPlab/SLUG.
