Action Priors for Large Action Spaces in Robotics. (arXiv:2101.04178v1 [cs.RO])

In robotics, it is often not possible to learn useful policies using pure
model-free reinforcement learning without significant reward shaping or
curriculum learning. As a consequence, many researchers rely on expert
demonstrations to guide learning. However, acquiring expert demonstrations can
be expensive. This paper proposes an alternative approach where the solutions
of previously solved tasks are used to produce an action prior that can
facilitate exploration in future tasks. The action prior is a probability
distribution over actions that summarizes the set of policies found while solving
previous tasks. Our results indicate that this approach can be used to solve
robotic manipulation problems that would otherwise be infeasible without expert
demonstrations.
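
To make the idea of an action prior concrete, here is a minimal sketch. It is not the paper's method, which the abstract does not specify. It assumes a small discrete action space, models each previously learned policy as a function from a state to a list of action probabilities, and takes the prior to be the plain average of those distributions. The names build_action_prior, sample_exploratory_action, and uniform_mix are hypothetical and chosen only for illustration.

```python
import random


def build_action_prior(previous_policies, state):
    """Average the action distributions of earlier policies at `state`.

    Assumption: each policy is a callable returning a list of action
    probabilities of the same length. Plain averaging is an illustrative
    choice, not necessarily how the paper summarizes policies.
    """
    dists = [policy(state) for policy in previous_policies]
    n_actions = len(dists[0])
    prior = [sum(d[a] for d in dists) / len(dists) for a in range(n_actions)]
    total = sum(prior)
    return [p / total for p in prior]  # renormalize against rounding drift


def sample_exploratory_action(prior, rng, uniform_mix=0.1):
    """Sample an action from the prior mixed with a uniform distribution.

    The uniform component (a hypothetical safeguard) keeps every action
    reachable even if no earlier task ever used it.
    """
    n = len(prior)
    mixed = [(1.0 - uniform_mix) * p + uniform_mix / n for p in prior]
    return rng.choices(range(n), weights=mixed, k=1)[0]


# Two made-up policies from "previous tasks" over four discrete actions.
def policy_a(state):
    return [0.7, 0.1, 0.1, 0.1]


def policy_b(state):
    return [0.5, 0.3, 0.1, 0.1]


rng = random.Random(0)
prior = build_action_prior([policy_a, policy_b], state=None)
action = sample_exploratory_action(prior, rng)
```

In this sketch, exploration in a new task is biased toward actions that earlier solutions favored, rather than drawn uniformly from the whole action space.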


