Enhancing Exploration in Reinforcement Learning through Multi-Step Actions
Date
2020
Authors
Medini, Tharun
Abstract
Reinforcement Learning (RL) has long been plagued by slow and uncertain training owing to poor exploration in existing techniques, largely because no training data is available beforehand. Moreover, querying a neural network after every single step is wasteful, since some states are conducive to multi-step actions. Because training data is generated on-the-fly, it is hard to pre-identify action sequences that consistently yield high rewards. Prior research in RL has focused on algorithms that train multiple agents in parallel and aggregate their experience to train faster; concurrently, work has been done on dynamically identifying action sequences suited to a specific input state. In this work, we provide insights into the necessity of, and training methods for, RL with multi-step action sequences used in conjunction with the primitive actions of an RL environment. We discuss two approaches. The first is A4C (Anticipatory Asynchronous Advantage Actor-Critic), a method that extracts twice as many gradient updates from the same number of episodes and thereby achieves higher scores and converges faster. The second is an alternative to Imitation Learning that removes the need for expert state-action pairs: with as few as 20 expert action trajectories, we identify the most frequent action pairs and append them to the novice's action space. We demonstrate the strength of both approaches by consistently and significantly outperforming the state-of-the-art GPU-enabled A3C (GA3C) on popular ATARI games.
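As a rough illustration of the second approach, the sketch below mines the most frequent consecutive action pairs from a small set of expert trajectories and exposes them as composite actions alongside the primitive ones. The trajectory format, the `top_k` cutoff, and the `MultiStepActionWrapper` name are illustrative assumptions for this sketch, not the implementation described in the thesis.

```python
from collections import Counter

import gymnasium as gym


def frequent_action_pairs(expert_trajectories, top_k=5):
    """Count consecutive action pairs across expert trajectories and return the top_k pairs."""
    counts = Counter()
    for actions in expert_trajectories:
        counts.update(zip(actions[:-1], actions[1:]))
    return [pair for pair, _ in counts.most_common(top_k)]


class MultiStepActionWrapper(gym.Wrapper):
    """Illustrative wrapper: extend the primitive action space with two-step composite actions."""

    def __init__(self, env, action_pairs):
        super().__init__(env)
        self.action_pairs = list(action_pairs)
        self.n_primitive = env.action_space.n
        # Indices [0, n_primitive) remain primitive; higher indices map to mined action pairs.
        self.action_space = gym.spaces.Discrete(self.n_primitive + len(self.action_pairs))

    def step(self, action):
        if action < self.n_primitive:
            return self.env.step(action)
        # Composite action: execute both primitive actions, accumulating the reward.
        a1, a2 = self.action_pairs[action - self.n_primitive]
        obs, r1, terminated, truncated, info = self.env.step(a1)
        if terminated or truncated:
            return obs, r1, terminated, truncated, info
        obs, r2, terminated, truncated, info = self.env.step(a2)
        return obs, r1 + r2, terminated, truncated, info


# Hypothetical usage: mine pairs from ~20 expert trajectories (lists of action ids), then wrap the env.
# expert_trajectories = [[2, 2, 3, 0, ...], ...]
# pairs = frequent_action_pairs(expert_trajectories, top_k=5)
# env = MultiStepActionWrapper(gym.make("ALE/Pong-v5"), pairs)
```

The novice agent can then be trained with any standard policy-gradient method over the enlarged action space; no expert states are required, only the action sequences.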
Description
Advisor
Degree
Type
Thesis
Keywords
Citation
Medini, Tharun. "Enhancing Exploration in Reinforcement Learning through Multi-Step Actions." (2020) Master’s Thesis, Rice University. https://hdl.handle.net/1911/109644.