How Much Do Unstated Problem Constraints Limit Deep Robotic Reinforcement Learning?

Lewis, W. Cannon II; Moll, Mark; Kavraki, Lydia E.

Files

TR19-01.pdf (905.53 KB)

Date

2019

Authors

Lewis, W. Cannon II

Moll, Mark

Kavraki, Lydia E.

Abstract

Deep Reinforcement Learning is a promising paradigm for robotic control which has been shown to be capable of learning policies for high-dimensional, continuous control of unmodeled systems. However, Robotic Reinforcement Learning currently lacks clearly defined benchmark tasks, which makes it difficult for researchers to reproduce and compare against prior work. “Reacher” tasks, which are fundamental to robotic manipulation, are commonly used as benchmarks, but the lack of a formal specification elides details that are crucial to replication. In this paper we present a novel empirical analysis which shows that the unstated spatial constraints in commonly used implementations of Reacher tasks make it dramatically easier to learn a successful control policy with Deep Deterministic Policy Gradients (DDPG), a state-of-the-art Deep RL algorithm. Our analysis suggests that less constrained Reacher tasks are significantly more difficult to learn, and hence that existing de facto benchmarks are not representative of the difficulty of general robotic manipulation.

Type

Technical report

Citation

Lewis, W. Cannon II, Moll, Mark and Kavraki, Lydia E. "How Much Do Unstated Problem Constraints Limit Deep Robotic Reinforcement Learning?" (2019) https://doi.org/10.25611/az5z-xt37

Published Version

https://doi.org/10.25611/az5z-xt37

Rights

You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).

Citable link to this page

https://hdl.handle.net/1911/107403

Collections

Computer Science Technical Reports
