ICML Poster Towards Understanding Parametric Generalized Category Discovery on Graphs

Bowen Deng · Lele Fu · Jialong Chen · Sheng Huang · Tianchi Liao · Zhang Tao · Chuan Chen

East Exhibition Hall A-B #E-1301

Wed 16 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

Generalized Category Discovery (GCD) aims to identify both known and novel categories in unlabeled data by leveraging knowledge from old classes. However, existing methods are limited to non-graph data and lack theoretical foundations to answer when and how known classes can help GCD. We introduce the Graph GCD task and provide the first rigorous theoretical analysis of parametric GCD. By quantifying the relationship between old and new classes in the embedding space using the Wasserstein distance W, we derive the first provable GCD loss bound based on W. This analysis highlights two necessary conditions for effective GCD. However, we uncover, through a Pairwise Markov Random Field perspective, that popular graph contrastive learning (GCL) methods inherently violate these conditions. To address this limitation, we propose SWIRL, a novel GCL method for GCD. Experimental results validate our theoretical findings and demonstrate SWIRL's effectiveness.
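
To make the idea of measuring how far new-class embeddings sit from old-class embeddings more concrete, here is a minimal sketch. It is not the paper's definition of W or its loss bound, neither of which is given in the abstract. As a stand-in it estimates a sliced 1-Wasserstein distance: embeddings are projected onto random unit directions and the exact 1-D distance is averaged over those directions. The function names, the equal-sample-size restriction, and the toy Gaussian embeddings are all assumptions made for illustration.

```python
import math
import random


def wasserstein_1d(xs, ys):
    """Exact W1 between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-size samples, the optimal 1-D coupling matches sorted values.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def sliced_wasserstein(old_emb, new_emb, n_proj=64, seed=0):
    """Monte Carlo estimate of the sliced W1 between two embedding sets.

    Hypothetical helper: a cheap stand-in for a Wasserstein distance in the
    embedding space, not the quantity analyzed in the paper.
    """
    rng = random.Random(seed)
    dim = len(old_emb[0])
    total = 0.0
    for _ in range(n_proj):
        # Draw a random direction and normalize it to unit length.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        # Project both embedding sets onto the direction.
        proj_old = [sum(a * b for a, b in zip(e, d)) for e in old_emb]
        proj_new = [sum(a * b for a, b in zip(e, d)) for e in new_emb]
        total += wasserstein_1d(proj_old, proj_new)
    return total / n_proj


# Toy 2-D "embeddings": one old class, one nearby new class, one distant new class.
rng = random.Random(1)
old = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
near = [[rng.gauss(0.5, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
far = [[rng.gauss(5.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]

print("old vs. nearby new class: ", sliced_wasserstein(old, near))
print("old vs. distant new class:", sliced_wasserstein(old, far))
```

In this toy setup the distant class yields the larger value. How such a distance enters the GCD loss bound is specified in the paper, not here.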

Lay Summary:

In real-world networks like social media or recommendation systems, new types of users or items can appear that AI models have never seen before. We study how AI can automatically discover both known and unknown categories in such networks. We analyze how knowledge from known categories helps recognize new ones and show that many current methods are not well-suited for this task. Based on our findings, we develop a new method called SWIRL, which improves the ability of AI to identify unknown categories in complex network data.
