Poster in Affinity Workshop: LatinX in AI

The impact of on-policy parallelized data collection on network plasticity in deep reinforcement learning

Walter Mayor · Johan Obando-Ceron · Aaron Courville · Pablo Samuel Castro


Abstract:

The use of parallel actors for data collection has been an effective technique in reinforcement learning (RL) algorithms. The manner in which data is collected in these algorithms, controlled via the number of parallel environments and the rollout length, induces a form of bias-variance trade-off; the number of training passes over the collected data, on the other hand, must strike a balance between sample efficiency and overfitting. We conduct an empirical analysis of these trade-offs on PPO, one of the most popular RL algorithms that uses parallel actors, and establish connections to network plasticity and, more generally, optimization stability. We examine the impact of these choices on network architectures, as well as hyper-parameter sensitivity when scaling data. Our analyses indicate that larger dataset sizes can increase final performance across a variety of settings, and that scaling parallel environments is more effective than increasing rollout lengths. These findings highlight the critical role of data collection strategies in improving agent performance.
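As a rough illustration of the trade-off the abstract describes, the sketch below shows how the number of parallel environments and the rollout length jointly determine the per-update dataset size in a PPO-style collection loop. It uses Gymnasium's vector-env API with arbitrary hyper-parameter values and a random policy as a stand-in; none of these settings are the paper's actual configuration.

```python
import gymnasium as gym
import numpy as np

# Illustrative hyper-parameters (hypothetical values, not the paper's settings).
num_envs = 8          # parallel environments
rollout_length = 128  # steps collected per environment before each update
num_epochs = 4        # training passes over each collected batch

# The per-update dataset size is the product of the two collection knobs,
# so the same batch can come from many envs with short rollouts or from
# few envs with long rollouts -- the trade-off studied in the poster.
batch_size = num_envs * rollout_length  # 1024 transitions per update

envs = gym.vector.SyncVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(num_envs)]
)
obs, _ = envs.reset(seed=0)

obs_buf = np.zeros(
    (rollout_length, num_envs) + envs.single_observation_space.shape,
    dtype=np.float32,
)
rew_buf = np.zeros((rollout_length, num_envs), dtype=np.float32)

# Collect one parallel rollout of experience.
for t in range(rollout_length):
    actions = envs.action_space.sample()  # stand-in for the PPO policy
    obs_buf[t] = obs
    obs, rewards, terminations, truncations, _ = envs.step(actions)
    rew_buf[t] = rewards

# A PPO learner would now run `num_epochs` passes of minibatch SGD over
# these `batch_size` transitions before collecting the next rollout.
```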
