Poster in Workshop: Methods and Opportunities at Small Scale (MOSS)
Evaluating Generalization and Representation Stability in Small LMs via Prompting, Fine-Tuning and Out-of-Distribution Prompts
Rahul Raja · Arpita Vats
Keywords: [ transformer models ] [ supervised fine-tuning ] [ representation learning ] [ generalization ] [ out-of-distribution robustness ] [ task-specific tuning ] [ model scaling ] [ small language models ] [ parameter efficiency ] [ experimental analysis ] [ few-shot prompting ] [ prompt engineering ] [ in-context learning ] [ model adaptation ] [ low-resource NLP ]
We investigate the generalization capabilities of small language models under two popular adaptation paradigms: few-shot prompting and supervised fine-tuning. While prompting is often favored for its parameter efficiency and flexibility, it remains unclear how robust this approach is in low-resource settings and under distributional shifts. This paper presents a comparative study of prompting and fine-tuning across task formats, prompt styles, and model scales, with a focus on their behavior in both in-distribution and out-of-distribution (OOD) settings. Beyond accuracy, we analyze the internal representations learned by each approach to assess the stability and abstraction of task-specific features. Our findings highlight critical differences in how small models internalize and generalize knowledge under different adaptation strategies. This work offers practical guidance for model selection in low-data regimes and contributes empirical insight into the ongoing debate over prompting versus fine-tuning.
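A minimal sketch of one way to probe representation stability of the kind the abstract describes: extract per-layer hidden states from a small LM for in-distribution and OOD prompts and compare them with a simple similarity measure. This is not the authors' protocol; the model name, prompt sets, and the cosine-based comparison are illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): compare per-layer hidden states
# of a small LM on in-distribution vs. out-of-distribution prompts.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "gpt2"  # hypothetical stand-in for a small LM under study
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def mean_pooled_hidden_states(texts):
    """Return one mean-pooled vector per layer, averaged over the prompt set."""
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    mask = inputs["attention_mask"].unsqueeze(-1)  # ignore padding tokens
    pooled = []
    for layer in outputs.hidden_states:            # one (batch, seq, dim) per layer
        summed = (layer * mask).sum(dim=1) / mask.sum(dim=1)  # (batch, dim)
        pooled.append(summed.mean(dim=0))                      # (dim,)
    return pooled

# Hypothetical ID vs. OOD prompt sets for a sentiment-style task
id_prompts = ["Review: The film was wonderful. Sentiment:",
              "Review: A dull, lifeless plot. Sentiment:"]
ood_prompts = ["Tweet: best concert everrr!! Sentiment:",
               "Tweet: total waste of my evening. Sentiment:"]

id_reps = mean_pooled_hidden_states(id_prompts)
ood_reps = mean_pooled_hidden_states(ood_prompts)

# Per-layer cosine similarity as a rough proxy for representation stability
for i, (a, b) in enumerate(zip(id_reps, ood_reps)):
    sim = torch.nn.functional.cosine_similarity(a, b, dim=0).item()
    print(f"layer {i:2d}: ID-vs-OOD similarity = {sim:.3f}")
```

The same pooling routine could be run on a prompted checkpoint and a fine-tuned one, with a richer metric such as CKA substituted for cosine similarity, to contrast how each adaptation strategy shifts representations under distributional change.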