ICML Expo Demonstration Real-World Autonomy: Building Modular, Voice-Guided Embodied Agents with SLMs and Vision

Aastha Varma · Sushant Moon · Jess Torres

West Exhibition Hall A-B1


Mon 14 Jul 4 p.m. PDT — 7 p.m. PDT

Abstract:

We present a new approach to embodied intelligence, one grounded in modular AI systems that combine small language models (SLMs), vision models, and speech interfaces. This architecture enables fast, intuitive agent behavior, even in low-resource, real-world environments.

Our prototype, an AI-powered exoskeleton, performs physical tasks through natural human interaction. It operates in three modes: Shadow (mimic gestures), Command (respond to voice), and Training (learn by demonstration). SLMs handle high-level reasoning, while fast, modular controllers manage low-level control.
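As a rough illustration of how the three modes and the split between high-level reasoning and low-level control might fit together, here is a minimal sketch. It is not the presenters' implementation: the names `Mode`, `Agent`, `keyword_planner`, and the `grasp` skill are hypothetical, and the SLM is replaced by a toy keyword matcher so the example runs on its own.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Mode(Enum):
    SHADOW = "shadow"      # mimic the operator's gestures
    COMMAND = "command"    # respond to voice commands
    TRAINING = "training"  # learn by demonstration


@dataclass
class Agent:
    # Hypothetical stand-in for the SLM: maps an utterance to a skill name.
    planner: Callable[[str], str]
    # Hypothetical low-level controllers, keyed by skill name.
    controllers: Dict[str, Callable[[], str]]
    # Demonstrations recorded in Training mode, keyed by label.
    demos: Dict[str, List[str]] = field(default_factory=dict)

    def step(self, mode: Mode, observation: str, label: str = "") -> str:
        if mode is Mode.SHADOW:
            # The observed gesture (assumed to come from a vision model)
            # is mirrored directly, with no planning step.
            return f"mirror:{observation}"
        if mode is Mode.COMMAND:
            # High-level reasoning picks a skill; a controller executes it.
            skill = self.planner(observation)
            controller = self.controllers.get(skill)
            return controller() if controller else "noop"
        # Training: store the demonstrated motion under the given label.
        self.demos.setdefault(label, []).append(observation)
        return f"recorded:{label}"


def keyword_planner(utterance: str) -> str:
    """Toy planner (assumption): a real system would query an SLM here."""
    return "grasp" if "pick" in utterance.lower() else "unknown"


agent = Agent(
    planner=keyword_planner,
    controllers={"grasp": lambda: "grasp_executed"},
)
```

Under these assumptions, `agent.step(Mode.COMMAND, "Pick up the cup")` routes through the planner to the `grasp` controller, while Shadow mode bypasses the planner entirely. Keeping the planner and controllers as plain callables is one way to make each module swappable.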

This approach removes the need for heavy simulations and makes it easier for engineers and researchers to build real-world systems with limited resources.


Schedule

Mon 4:00 p.m. - 7:00 p.m.

Real-World Autonomy: Building Modular, Voice-Guided Embodied Agents with SLMs and Vision (Demonstration)

Aastha Varma