Poster
Persistent Topological Features in Large Language Models
Yuri Gardinazzi · Karthik Viswanathan · Giada Panerai · Alessio Ansuini · Alberto Cazzaniga · Matteo Biagetti
East Exhibition Hall A-B #E-2812
Large language models (LLMs), like those behind popular AI chatbots, can generate impressively human-like responses to text prompts, but how they process information internally remains largely unknown. Given how widely these models are being deployed in increasingly important tasks, this lack of transparency raises serious concerns in the scientific community and beyond. Researchers also want to make these large models smaller and less resource-intensive without losing their effectiveness.

To tackle both issues, our work brings in tools from mathematics, specifically “topological data analysis,” which is well suited to describing complex shapes and relationships in data. We apply a technique called zigzag persistence to track how information evolves across the layers of an LLM, rather than examining each layer in isolation. This lets us measure how groups of data points change and interact through the whole model. With this method, we can identify distinct “phases” in how the model processes language inputs, and we can use these findings to suggest which layers could be removed (pruned) to compress the model without major performance loss.

Our approach works across different models and datasets, offering a new window into how LLMs actually work and paving the way to safer, more efficient AI systems.
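As a rough illustration of the idea (not the authors' implementation), the sketch below tracks how clusters of prompt representations survive from one transformer layer to the next, using connected components of k-nearest-neighbour graphs as a simplified stand-in for the zigzag-persistence pipeline described above. The list of layer activations `layer_reps` and the neighbourhood size `k` are assumed inputs, not names from the paper.

```python
# Simplified sketch (not the paper's method): follow how clusters of prompt
# representations persist between adjacent transformer layers. The actual work
# uses zigzag persistence; here we only track connected components of k-NN
# graphs, i.e. a crude proxy for 0-dimensional topological features.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import connected_components


def component_labels(X, k=10):
    """Label each point with the connected component of its symmetric k-NN graph."""
    graph = kneighbors_graph(X, n_neighbors=k, mode="connectivity")
    graph = graph.maximum(graph.T)  # symmetrize so the graph is undirected
    _, labels = connected_components(graph, directed=False)
    return labels


def layer_overlap(labels_a, labels_b):
    """Fraction of same-cluster point pairs in one layer that stay together in the next."""
    same_a = labels_a[:, None] == labels_a[None, :]
    same_b = labels_b[:, None] == labels_b[None, :]
    return (same_a & same_b).sum() / max(same_a.sum(), 1)


def persistence_profile(layer_reps, k=10):
    """`layer_reps`: assumed list of (n_prompts, hidden_dim) activation matrices,
    one per layer, e.g. collected with output_hidden_states=True."""
    labels = [component_labels(X, k=k) for X in layer_reps]
    return [layer_overlap(labels[i], labels[i + 1]) for i in range(len(labels) - 1)]
```

Under these assumptions, sharp drops in the resulting profile would mark candidate boundaries between processing “phases,” while near-perfect overlap between adjacent layers would hint at redundancy, i.e. layers that might be pruned with little performance loss.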