Poster
Online Learning with Unknown Constraints
Karthik Sridharan · Seung Won Wilson Yoo
West Exhibition Hall B2-B3 #W-916
How can we design machine learning algorithms that are always safe, even when we don't fully know what 'safe' means at the start? Whereas previous work studies being safe on average, our work tackles the more challenging setting of being safe at every step. This matters when safety cannot be compromised even once, such as preventing a robot from crashing or complying with laws and regulations. We build algorithms that perform well while also discovering what 'safe' means from noisy feedback. Our algorithm takes optimistic actions that are guaranteed to perform well and converts them into pessimistic actions that are guaranteed to be safe. We also introduce a rule - technically, a complexity measure - that captures the trade-off between performing well and staying safe. Finally, we illustrate the algorithm in several practical settings.
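To make the "optimistic action, then pessimistic conversion" idea concrete, here is a minimal toy sketch, not the paper's algorithm: it assumes a known linear reward, a single unknown linear constraint observed through noisy feedback, a known safe fallback action, and a ridge-regression confidence set. All names and numbers (theta, c_true, b, beta, etc.) are illustrative assumptions introduced for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative assumptions, not from the paper): actions lie in the
# unit ball, the reward direction theta is known, and the unknown linear
# constraint <c_true, x> <= b is observed only through noisy per-round feedback.
d = 2
theta = np.array([1.0, 0.5])         # known reward direction
c_true = np.array([0.8, -0.6])       # unknown constraint direction
b = 0.3                              # constraint budget
x_safe = np.zeros(d)                 # a known, strictly safe fallback action

# Ridge-regression state for estimating the constraint from noisy feedback.
lam, beta, sigma = 1.0, 2.0, 0.1
A = lam * np.eye(d)                  # regularized design matrix
v = np.zeros(d)                      # accumulated responses

violations = 0
for t in range(200):
    c_hat = np.linalg.solve(A, v)    # current constraint estimate
    A_inv = np.linalg.inv(A)

    # Optimistic step: the best-reward action, ignoring constraint uncertainty.
    x_opt = theta / np.linalg.norm(theta)

    # Pessimistic conversion: compute the worst-case (upper-confidence)
    # constraint value of the optimistic action, then shrink it toward the
    # known safe action until that pessimistic value fits within the budget.
    ucb = c_hat @ x_opt + beta * np.sqrt(x_opt @ A_inv @ x_opt)
    alpha = 1.0 if ucb <= b else b / ucb
    x_play = alpha * x_opt + (1.0 - alpha) * x_safe

    # Noisy constraint feedback refines the estimate for future rounds.
    y = c_true @ x_play + sigma * rng.standard_normal()
    A += np.outer(x_play, x_play)
    v += y * x_play

    violations += int(c_true @ x_play > b + 1e-9)

print(f"hard-constraint violations over 200 rounds: {violations}")
```

The key design point this sketch tries to mirror is that safety is enforced pessimistically at every round (the played action must satisfy the worst-case constraint estimate), while performance comes from starting at an optimistic action and only backing off as much as the current uncertainty requires.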