Planning with Uncertainty: Deep Exploration in Model-Based Reinforcement Learning. (arXiv:2210.13455v1 [cs.LG])

Deep model-based Reinforcement Learning (RL) has shown super-human
performance in many challenging domains. Low sample efficiency and limited
exploration, however, remain leading obstacles in the field. In this paper,
we demonstrate deep exploration in model-based RL by incorporating epistemic
uncertainty into planning trees, circumventing the standard approach of
propagating uncertainty through value learning. We evaluate this approach with
the state-of-the-art model-based RL algorithm MuZero, and extend its training
process to stabilize learning from explicitly exploratory trajectories. In our
experiments, planning with uncertainty achieves effective deep exploration
using standard uncertainty-estimation mechanisms, and with it significant
gains in sample efficiency.
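The core idea of biasing tree search toward epistemically uncertain branches can be sketched as a PUCT-style selection rule with an optimism bonus added to the value estimate. The node structure, the `beta` weight, and all names below are illustrative assumptions for this sketch, not the paper's actual implementation:

```python
import math

class Node:
    """A planning-tree node carrying a value estimate and an epistemic
    uncertainty estimate (e.g. derived from ensemble disagreement)."""
    def __init__(self, prior, value=0.0, uncertainty=0.0):
        self.prior = prior              # policy prior for this action
        self.value = value              # mean value estimate
        self.uncertainty = uncertainty  # epistemic uncertainty estimate
        self.visit_count = 0

def exploratory_score(node, parent_visits, c_puct=1.25, beta=1.0):
    """PUCT score where the plain value estimate is replaced by an
    optimistic one: value + beta * uncertainty."""
    u = c_puct * node.prior * math.sqrt(parent_visits) / (1 + node.visit_count)
    return (node.value + beta * node.uncertainty) + u

# With equal values and priors, the epistemically uncertain child is
# selected, steering the search toward under-explored regions.
children = [Node(prior=0.5, value=0.5, uncertainty=0.0),
            Node(prior=0.5, value=0.5, uncertainty=0.3)]
best = max(children, key=lambda n: exploratory_score(n, parent_visits=10))
```

Setting `beta = 0` recovers standard (non-exploratory) selection, so the bonus can be applied only when explicitly exploratory behavior is desired.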
