Enhancing Exploration in Reinforcement Learning through Multi-Step Actions

dc.contributor.advisor: Shrivastava, Anshumali
dc.creator: Medini, Tharun
dc.date.accessioned: 2020-12-10T17:37:28Z
dc.date.available: 2020-12-10T17:37:28Z
dc.date.created: 2020-12
dc.date.issued: 2020-12-03
dc.date.submitted: December 2020
dc.date.updated: 2020-12-10T17:37:28Z
dc.description.abstract: The paradigm of Reinforcement Learning (RL) has been plagued by slow and uncertain training owing to poor exploration in existing techniques. This can be mainly attributed to the lack of training data beforehand. Further, querying a neural network after every step is wasteful, as some states are conducive to multi-step actions. Since we train with data generated on the fly, it is hard to pre-identify action sequences that consistently yield high rewards. Prior research in RL has focused on designing algorithms that train multiple agents in parallel and accumulate information from these agents to train faster. Concurrently, research has also been done to dynamically identify action sequences suited to a specific input state. In this work, we provide insights into the necessity of, and training methods for, RL with multi-step action sequences used in conjunction with the primitive actions of an RL environment. We broadly discuss two approaches. The first is A4C (Anticipatory Asynchronous Advantage Actor-Critic), a method that squeezes twice the gradients from the same number of episodes and thereby achieves higher scores and converges faster. The second is an alternative to Imitation Learning that mitigates the need for expert state-action pairs. With as few as 20 expert action trajectories, we can identify the most frequent action pairs and append them to the novice's action space. We show the power of our approaches by consistently and significantly outperforming the state-of-the-art GPU-enabled A3C (GA3C) on popular ATARI games.
dc.format.mimetype: application/pdf
dc.identifier.citation: Medini, Tharun. "Enhancing Exploration in Reinforcement Learning through Multi-Step Actions." (2020) Master’s Thesis, Rice University. https://hdl.handle.net/1911/109644.
dc.identifier.uri: https://hdl.handle.net/1911/109644
dc.language.iso: eng
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subject: Reinforcement Learning
dc.subject: Imitation Learning
dc.subject: Machine Learning
dc.subject: ATARI
dc.subject: DeepMind
dc.subject: A3C
dc.subject: GA3C
dc.subject: Actor Critic
dc.title: Enhancing Exploration in Reinforcement Learning through Multi-Step Actions
dc.type: Thesis
dc.type.material: Text
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Engineering
thesis.degree.grantor: Rice University
thesis.degree.level: Masters
thesis.degree.name: Master of Science
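
The abstract's second approach builds macro-actions by mining the most frequent consecutive action pairs from a small set of expert trajectories. A minimal sketch of that idea, assuming each trajectory is a plain list of discrete action ids; the function and variable names here are illustrative, not taken from the thesis:

    from collections import Counter

    def top_action_pairs(trajectories, k=5):
        """Count consecutive action pairs across expert trajectories
        and return the k most frequent ones as candidate macro-actions."""
        counts = Counter()
        for actions in trajectories:
            counts.update(zip(actions, actions[1:]))  # consecutive (a_t, a_t+1) pairs
        return [pair for pair, _ in counts.most_common(k)]

    # Hypothetical usage with two toy expert trajectories over 4 primitive actions.
    expert_trajectories = [
        [0, 1, 1, 2, 1, 1, 3],
        [1, 1, 2, 2, 1, 1, 0],
    ]
    macro_actions = top_action_pairs(expert_trajectories, k=2)
    # Each frequent pair is appended to the novice's action space; selecting
    # such a macro-action executes both primitive steps in order.
    novice_actions = list(range(4)) + macro_actions
    print(macro_actions)  # [(1, 1), (1, 2)] for this toy data

In the thesis setting this counting would run over the roughly 20 expert trajectories mentioned in the abstract, and the resulting pairs would extend the agent's discrete action set rather than a Python list.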
Files

Original bundle (1 item)
Name: MEDINI-DOCUMENT-2020.pdf
Size: 1.93 MB
Format: Adobe Portable Document Format

License bundle (2 items)
Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text

Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text