Poster in Workshop: Methods and Opportunities at Small Scale (MOSS)
AdaptMI: Adaptive Skill-based In-context Math Instructions for Small Language Models
Yinghui He · Abhishek Panigrahi · Yong LIN · Sanjeev Arora
Keywords: [ large language models ] [ small language models ] [ test-time adaptation ] [ in-context learning ] [ natural language processing ]
In-context learning (ICL) enhances language model performance by providing relevant contextual information. Recent works (Didolkar et al., 2024a;b) show that ICL performance can be improved by leveraging a frontier large language model's (LLM) ability to predict the skills required to solve a problem, popularly referred to as an LLM's metacognition, and using the recommended skills to construct the necessary in-context examples. While this improves performance in larger models, smaller language models (SLMs) see minimal benefit, revealing a performance gap. We show that skill-based prompting can hurt SLM performance on easy questions by introducing unnecessary information, akin to cognitive overload. To mitigate this, we introduce AdaptMI, an Adaptive strategy for selecting skill-based Math Instructions. Guided by cognitive load theory, AdaptMI introduces skill-based examples only when the model performs poorly. We further propose AdaptMI+, which provides targeted examples for the specific missing skills. In 5-shot evaluations on popular math benchmarks and five SLMs (1B–7B; Qwen, Llama), AdaptMI+ improves accuracy by up to 6% compared to naive skill-based methods.
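The adaptive selection idea can be sketched in a few lines: include skill-based in-context examples only when a performance signal suggests the SLM is struggling, and (in the AdaptMI+ variant) target examples at the skills the model appears to be missing. The function below is a minimal illustrative sketch under assumed data structures (a `skill_bank` mapping skills to example problems, a scalar `slm_score` difficulty signal, and a `threshold`); none of these names or values come from the paper's actual implementation.

```python
def select_in_context_examples(question, slm_score, skill_bank,
                               missing_skills, threshold=0.5, k=5):
    """Return up to k in-context examples for `question`.

    Hypothetical sketch of the adaptive strategy described in the abstract:
    - If the SLM already performs well (slm_score >= threshold), return no
      skill-based examples, avoiding cognitive overload on easy questions.
    - Otherwise, gather examples for the detected missing skills
      (AdaptMI+-style targeting) and cap the prompt at k examples.
    """
    if slm_score >= threshold:
        # Easy question: skip skill-based examples entirely.
        return []
    examples = []
    for skill in missing_skills:
        # Pull worked examples tagged with this skill, if any exist.
        examples.extend(skill_bank.get(skill, []))
    return examples[:k]


# Illustrative usage with a toy skill bank (all data hypothetical):
skill_bank = {"fractions": ["ex_frac_1", "ex_frac_2"], "algebra": ["ex_alg_1"]}
easy = select_in_context_examples("q1", slm_score=0.9,
                                  skill_bank=skill_bank,
                                  missing_skills=["fractions"])
hard = select_in_context_examples("q2", slm_score=0.2,
                                  skill_bank=skill_bank,
                                  missing_skills=["fractions", "algebra"])
```

Here `easy` is an empty list (the model is judged capable, so no extra instructions are added), while `hard` contains the fraction and algebra examples, truncated to the 5-shot budget.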