Poster in Workshop: AI Heard That! ICML 2025 Workshop on Machine Learning for Audio

The Sound of Syntax: Finetuning and Comprehensive Evaluation of Language Models for Speech Pathology

Fagun Patel · Duc Nguyen · Sang Truong · Jody Vaynshtok · Sanmi Koyejo · Nick Haber


Abstract:

About 5–9% of children in the U.S. have speech disorders requiring clinical care, yet certified speech-language pathologists (SLPs) are outnumbered by affected children nearly 20 to one. This care gap underscores the need for automation in SLP workflows. Current AI solutions focus on isolated tasks, limiting performance and increasing deployment costs, while scarce annotated data further hinders progress. Recent advances in multimodal large language models (LLMs), especially speech LLMs, present new opportunities to automate core SLP tasks and generate high-quality datasets, though their potential remains underexplored. In this study, we introduce the first unified benchmark for five key SLP tasks—disorder screening, transcription, disorder-type classification, symptom identification, and transcript-based classification—along with a cross-task finetuning strategy that enhances model performance. Evaluating 15 state-of-the-art LLMs, we find that transcription-based finetuning significantly improves results across tasks, with gains exceeding 30% over baselines. We release our datasets, models, and framework to support further research.
