Expo Demonstration: Real-World Autonomy: Building Modular, Voice-Guided Embodied Agents with SLMs and Vision
Aastha Varma
Abstract:
We present a new approach to embodied intelligence, one grounded in modular AI systems that combine small language models (SLMs), vision models, and speech interfaces. This architecture enables fast, intuitive agent behavior, even in low-resource, real-world environments.
Our prototype, an AI-powered exoskeleton, performs physical tasks through natural human interaction. It operates in three modes: Shadow (mimic gestures), Command (respond to voice), and Training (learn by demonstration). SLMs handle high-level reasoning, while fast, modular controllers manage low-level control.
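The three-mode split can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Agent` class, the `keyword_planner` stand-in for an SLM call, and the representation of a gesture as a list of joint angles are all assumptions made for illustration. The sketch only shows how a high-level planner could choose a skill by name while the joint-level frames are passed to a separate low-level controller.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Mode(Enum):
    SHADOW = "shadow"      # mimic the operator's gestures
    COMMAND = "command"    # respond to spoken instructions
    TRAINING = "training"  # learn a skill by demonstration


# Hypothetical planner interface: maps a transcribed utterance to a skill name.
Planner = Callable[[str], str]


def keyword_planner(utterance: str) -> str:
    """Toy stand-in for an SLM call (assumption, not a real model)."""
    text = utterance.lower()
    if "lift" in text:
        return "lift"
    if "release" in text or "drop" in text:
        return "release"
    return "hold"


@dataclass
class Agent:
    planner: Planner = keyword_planner
    # Skills recorded in Training mode: skill name -> joint-angle frames.
    skills: Dict[str, List[List[float]]] = field(default_factory=dict)

    def step(self, mode: Mode, *, pose=None, utterance="", skill_name=""):
        """Return the joint-angle frames a low-level controller should track."""
        if mode is Mode.SHADOW:
            # Pass the observed pose straight through.
            return [list(pose)]
        if mode is Mode.TRAINING:
            # Append the demonstrated pose to the named skill; nothing to execute.
            self.skills.setdefault(skill_name, []).append(list(pose))
            return []
        # Command mode: the planner picks a skill, and its frames are replayed.
        return list(self.skills.get(self.planner(utterance), []))


agent = Agent()
agent.step(Mode.TRAINING, pose=[0.0, 0.5], skill_name="lift")
agent.step(Mode.TRAINING, pose=[0.3, 0.9], skill_name="lift")
frames = agent.step(Mode.COMMAND, utterance="Please lift the box")
```

In this sketch the planner never touches joint angles directly. It returns only a skill name, which keeps the slow reasoning step separate from the fast control path described in the abstract.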