Poster
MOGIC: Metadata-infused Oracle Guidance for Improved Extreme Classification
Suchith Chidananda Prabhu · Bhavyajeet Singh · Anshul Mittal · Siddarth Asokan · Shikhar Mohan · Deepak Saini · Yashoteja Prabhu · Lakshya Kumar · Jian Jiao · Amit Singh · Niket Tandon · Manish Gupta · Sumeet Agarwal · Manik Varma
East Exhibition Hall A-B #E-1807
Retrieval-augmented classification and generation models benefit from early-stage fusion of high-quality text-based metadata, often called memory, but face high latency and noise sensitivity. In extreme classification (XC), where low latency is crucial, existing methods use late-stage fusion for efficiency and robustness. To enhance accuracy while maintaining low latency, we propose MOGIC, a novel approach to metadata-infused oracle guidance for XC. We train an early-fusion oracle classifier with access to both query-side and label-side ground-truth metadata in textual form and subsequently use it to guide existing memory-based XC disciple models via regularization. The MOGIC algorithm improves precision@1 and propensity-scored precision@1 of XC disciple models by 1-2% on six standard datasets, at no additional inference-time cost. We show that MOGIC can be used in a plug-and-play manner to enhance memory-free XC models such as NGAME or DEXA. Lastly, we demonstrate the robustness of the MOGIC algorithm to missing and noisy metadata. The code is publicly available at https://github.com/suchith720/mogic.
In a text classification task, given a query, we aim to predict the labels relevant to that query. In many cases, additional information about the query or the labels (called metadata or memory) is available and can be leveraged to make better predictions. Currently popular methods such as RAG, which retrieve this metadata and augment the model input with it for classification or generation, tend to have high latency and are sensitive to noise.

We propose a two-stage technique that utilizes this extra information while also meeting latency constraints. First, we train a powerful oracle model that takes advantage of the metadata, assuming a best-case scenario where this information is also available during inference. Then, we train an off-the-shelf classifier model as a disciple that learns to mimic the behaviour of the oracle, but under the more challenging scenario wherein the metadata is not known a priori. In this way, we get the best of both worlds: a model that is fast and efficient at inference, while benefiting from the improved accuracy gained by learning from the powerful oracle.

We observe that this training technique (which we call Metadata-infused Oracle Guidance for Improved Extreme Classification, or MOGIC) improves the overall accuracy of any existing classifier, while also offering a novel approach to incorporating extra information into the classification setting; a sketch of the idea follows below.
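To make the two-stage recipe concrete, here is a minimal, self-contained PyTorch sketch of the oracle-guidance idea. It is illustrative only: the toy encoder, the single-positive cross-entropy objective, the cosine alignment regularizer, and the weight lam are assumptions made for exposition, not MOGIC's exact formulation (see the paper and the linked repository for the actual losses and architectures).

import torch
import torch.nn.functional as F

class Encoder(torch.nn.Module):
    """Toy text encoder: mean of token embeddings (stands in for a transformer)."""
    def __init__(self, vocab_size=30522, dim=256):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(vocab_size, dim)  # mode='mean' by default

    def forward(self, token_ids):
        # token_ids: LongTensor of shape (batch, seq_len)
        return F.normalize(self.emb(token_ids), dim=-1)

# Stage 1 (assumed done beforehand): an oracle encoder trained on the query text
# concatenated with ground-truth metadata (early fusion). Here we simply treat
# `oracle` as that already-trained model and freeze it.
oracle = Encoder()
for p in oracle.parameters():
    p.requires_grad_(False)

# Stage 2: the disciple sees only the raw query (no metadata at inference time)
# and is regularized to mimic the oracle's metadata-infused representations.
disciple = Encoder()
label_emb = torch.nn.Parameter(F.normalize(torch.randn(1000, 256), dim=-1))
opt = torch.optim.AdamW(list(disciple.parameters()) + [label_emb], lr=1e-4)

def train_step(query_ids, query_plus_meta_ids, pos_label, lam=0.5):
    q = disciple(query_ids)                        # disciple input: query only
    with torch.no_grad():
        q_star = oracle(query_plus_meta_ids)       # oracle input: query + metadata
    logits = q @ F.normalize(label_emb, dim=-1).T
    # Simplification: one positive label per query; real XC is multilabel.
    cls_loss = F.cross_entropy(logits, pos_label)
    # Oracle-guidance regularizer: pull the disciple's query embedding toward
    # the oracle's metadata-infused embedding of the same query.
    align_loss = (1 - F.cosine_similarity(q, q_star, dim=-1)).mean()
    loss = cls_loss + lam * align_loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage with random token ids (batch of 4; metadata lengthens the oracle input).
q_ids = torch.randint(0, 30522, (4, 16))
qm_ids = torch.randint(0, 30522, (4, 32))
y = torch.randint(0, 1000, (4,))
print(train_step(q_ids, qm_ids, y))

Note that the oracle and the ground-truth metadata appear only inside train_step; the disciple's inference path is identical to the base model's, which is why this style of guidance adds no inference-time cost.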