Poster in Workshop: ES-FoMo III: 3rd Workshop on Efficient Systems for Foundation Models
Training-Free Semantic Deferrals for Open-Ended LLM Cascades
Duncan Soiffer · Steven Kolawole · Virginia Smith
Abstract:
Existing cascade systems struggle with open-ended text generation because deferral is hard to evaluate when multiple valid outputs exist and no ground-truth reference is available. We propose using semantic agreement between multiple model outputs as a training-free deferral signal, and we evaluate semantic similarity metrics against token-level confidence across translation, summarization, question answering, and reading comprehension tasks. We show that semantic signals provide a stronger indication of when deferral is appropriate than token-level methods and are resilient to heterogeneous model quality.
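The core idea of the abstract, using agreement among several sampled outputs as a training-free deferral signal, can be sketched as follows. The abstract does not specify which semantic similarity metric is used; here token-level Jaccard overlap stands in for it, and the function names (`agreement`, `should_defer`) and the threshold value are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity -- a simple stand-in for the semantic
    similarity metrics evaluated in the paper (illustrative only)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def agreement(outputs: list[str]) -> float:
    """Mean pairwise similarity across several sampled outputs
    from the smaller model in the cascade."""
    pairs = list(combinations(outputs, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

def should_defer(outputs: list[str], threshold: float = 0.5) -> bool:
    """Defer to the larger model when the small model's samples
    disagree semantically (agreement below a chosen threshold)."""
    return agreement(outputs) < threshold
```

In a cascade, the small model would be sampled several times per query; consistent samples are accepted, while divergent ones trigger deferral to the larger model, with no training or ground-truth references required.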