Poster in Workshop: 2nd Workshop on Test-Time Adaptation: Putting Updates to the Test (PUT)
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
Vishnu Sarukkai · Zhiqiang Xie · Kayvon Fatahalian
Large Language Model (LLM) agents typically require extensive task-specific knowledge engineering. Instead, we investigate a self-improvement approach where agents learn from their own successful experiences without human intervention. Our method builds and refines a database of self-generated trajectories for in-context learning. Even naive accumulation of successful trajectories yields substantial performance gains across ALFWorld (73% to 89%), Wordcraft (55% to 64%), and InterCode-SQL (75% to 79%). We further enhance performance with 1) database-level curation using population-based training, and 2) exemplar-level curation that selectively retains trajectories based on their utility. On ALFWorld, our method achieves 93% success, surpassing approaches that use more powerful LLMs. Our results demonstrate that agents can autonomously improve through self-collected experience, without labor-intensive knowledge engineering.
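To make the naive accumulation strategy concrete, the sketch below shows one possible shape of a self-generated trajectory database: successful rollouts are stored, and the most similar ones are retrieved as in-context exemplars for new tasks. All names (`Trajectory`, `TrajectoryDB`, `run_agent`) and the word-overlap retrieval heuristic are hypothetical illustrations under assumed interfaces, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    task: str         # task description / goal
    steps: list[str]  # action-observation transcript
    success: bool     # whether the episode reached the goal


def run_agent(task: str, in_context: list["Trajectory"]) -> Trajectory:
    # Placeholder for an actual LLM agent rollout conditioned on the
    # retrieved exemplars; a real implementation would call the model here.
    return Trajectory(task=task, steps=[], success=False)


@dataclass
class TrajectoryDB:
    trajectories: list[Trajectory] = field(default_factory=list)

    def add(self, traj: Trajectory) -> None:
        # Naive accumulation: keep every successful trajectory, no curation.
        if traj.success:
            self.trajectories.append(traj)

    def retrieve(self, task: str, k: int = 3) -> list[Trajectory]:
        # Toy similarity: word overlap between task descriptions; a real
        # system would likely use embedding-based retrieval.
        def overlap(t: Trajectory) -> int:
            return len(set(task.split()) & set(t.task.split()))
        return sorted(self.trajectories, key=overlap, reverse=True)[:k]


def solve_with_self_examples(task: str, db: TrajectoryDB) -> Trajectory:
    """Attempt a task using self-generated trajectories as in-context examples."""
    examples = db.retrieve(task)                 # select exemplars from past successes
    traj = run_agent(task, in_context=examples)  # agent rollout on the new task
    db.add(traj)                                 # database grows as the agent succeeds
    return traj
```

The database- and exemplar-level curation steps described in the abstract would slot into `add` and `retrieve`, replacing the keep-everything policy and the overlap heuristic with population-based selection and utility-based retention, respectively.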