Poster in Workshop: DataWorld: Unifying data curation frameworks across domains

Learning from the Best: Smoothness-Driven Metrics for Data Quality in Imitation Learning

Soham Kulkarni · Raayan Dhar · Yuchen Cui

Keywords: [ data quality ] [ behaviour cloning ] [ imitation learning ] [ robot learning ]


Abstract:

In imitation learning, the quality of demonstration data is critical to policy performance. However, real-world datasets often contain suboptimal, noisy, or inconsistent trajectories due to human error and procedural variability. Existing data filtering techniques typically require extensive manual annotation or substantial computational cost, limiting their scalability.

We propose RINSE (Ranking and INdexing Smooth Examples), a general-purpose method for filtering demonstration data based on trajectory smoothness. RINSE evaluates each trajectory using both Cartesian and frequency-domain smoothness metrics, then ranks the trajectories and selects high-quality demonstrations for training. This approach enhances policy robustness and reduces the amount of data needed for effective learning.

We validate RINSE both in simulated imitation learning environments and on a real-world robotic platform. Our results demonstrate that RINSE enables more efficient and reliable policy learning from imperfect data. Furthermore, RINSE assigns dataset quality scores that correlate strongly with downstream performance and can be used to monitor the data collection process.
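
The abstract does not say which smoothness metrics RINSE uses, so the following is only a minimal sketch of the general idea of smoothness-based ranking, not the authors' implementation. It assumes mean squared jerk as the Cartesian metric and the share of velocity spectral energy above a cutoff frequency as the frequency-domain metric. The function names, the sampling interval `dt`, the cutoff `cutoff_hz`, and the rank-sum combination rule are all illustrative assumptions.

```python
import numpy as np


def mean_squared_jerk(positions, dt):
    """Cartesian proxy (assumed): mean squared third derivative of position."""
    jerk = np.diff(positions, n=3, axis=0) / dt**3
    return float(np.mean(np.sum(jerk**2, axis=1)))


def high_freq_energy_ratio(positions, dt, cutoff_hz):
    """Frequency-domain proxy (assumed): share of velocity energy above cutoff_hz."""
    velocity = np.diff(positions, axis=0) / dt
    spectrum = np.abs(np.fft.rfft(velocity, axis=0)) ** 2
    freqs = np.fft.rfftfreq(velocity.shape[0], d=dt)
    total = spectrum.sum()
    if total == 0.0:
        return 0.0  # a stationary trajectory has no high-frequency content
    return float(spectrum[freqs > cutoff_hz].sum() / total)


def rank_by_smoothness(trajectories, dt=0.05, cutoff_hz=5.0):
    """Return trajectory indices ordered from smoothest to roughest."""
    jerk = np.array([mean_squared_jerk(t, dt) for t in trajectories])
    hf = np.array([high_freq_energy_ratio(t, dt, cutoff_hz) for t in trajectories])
    # Combine the two metrics by summing their per-metric ranks (lower is smoother).
    combined = jerk.argsort().argsort() + hf.argsort().argsort()
    return np.argsort(combined, kind="stable")


# Synthetic demonstrations: one clean path plus two copies with added noise.
rng = np.random.default_rng(0)
t = np.arange(100) * 0.05
clean = np.stack([np.sin(t), np.cos(t), t], axis=1)  # shape (100, 3)
demos = [
    clean + rng.normal(scale=0.02, size=clean.shape),   # index 0: noisy
    clean,                                              # index 1: smooth
    clean + rng.normal(scale=0.005, size=clean.shape),  # index 2: mildly noisy
]

order = rank_by_smoothness(demos)
top_k = order[:2]  # keep the two smoothest demonstrations for training
print("smoothest to roughest:", order.tolist())
```

Summing ranks rather than raw values keeps the two metrics comparable without tuning a weight between them, since their scales differ by orders of magnitude. A real pipeline would need to choose metrics, cutoffs, and the selection threshold per task.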
