Towards mental time travel: a hierarchical memory for reinforcement learning agents. (arXiv:2105.14039v1 [cs.LG])
Reinforcement learning agents often forget details of the past, especially
after delays or distractor tasks. Agents with common memory architectures
struggle to recall and integrate across multiple timesteps of a past event, or
even to recall the details of a single timestep that is followed by distractor
tasks. To address these limitations, we propose a Hierarchical Transformer
Memory (HTM), which helps agents to remember the past in detail. HTM stores
memories by dividing the past into chunks, and recalls by first performing
high-level attention over coarse summaries of the chunks, and then performing
detailed attention within only the most relevant chunks. An agent with HTM can
therefore “mentally time-travel”, that is, remember past events in detail without
attending to all intervening events. We show that agents with HTM substantially
outperform agents with other memory architectures at tasks requiring long-term
recall, retention, or reasoning over memory. These include recalling where an
object is hidden in a 3D environment, rapidly learning to navigate efficiently
in a new neighborhood, and rapidly learning and retaining new object names.
Agents with HTM can extrapolate to task sequences an order of magnitude longer
than they were trained on, and can even generalize zero-shot from a
meta-learning setting to maintaining knowledge across episodes. HTM improves
agent sample efficiency, generalization, and generality (by solving tasks that
previously required specialized architectures). Our work is a step towards
agents that can learn, interact, and adapt in complex and temporally-extended
environments.
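
The two-stage recall described above (coarse attention over chunk summaries, then detailed attention within only the most relevant chunks) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify how summaries are formed or how per-chunk results are combined, so the mean-pooled summaries, the renormalized relevance weighting, and every name here (`hierarchical_recall`, `chunk_size`, `top_k`) are assumptions made for illustration. It handles a single query vector, uses the stored vectors as both keys and values, and has no learned projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def summarize(chunk):
    """Assumption: a chunk's coarse summary is the mean of its memory vectors."""
    dim = len(chunk[0])
    return [sum(m[i] for m in chunk) / len(chunk) for i in range(dim)]


def hierarchical_recall(query, memories, chunk_size=4, top_k=1):
    """Return (recalled_vector, indices_of_selected_chunks)."""
    # 1. Store: divide the past into fixed-size chunks.
    chunks = [memories[i:i + chunk_size]
              for i in range(0, len(memories), chunk_size)]

    # 2. High-level attention over the coarse chunk summaries.
    summaries = [summarize(c) for c in chunks]
    scale = math.sqrt(len(query))
    relevance = softmax([dot(query, s) / scale for s in summaries])

    # 3. Keep only the top_k most relevant chunks.
    ranked = sorted(range(len(chunks)), key=lambda i: relevance[i], reverse=True)
    selected = ranked[:top_k]

    # 4. Detailed attention within each selected chunk, combined using
    #    relevance weights renormalized over the selected chunks
    #    (an assumed combination rule).
    norm = sum(relevance[i] for i in selected)
    dim = len(memories[0])
    recalled = [0.0] * dim
    for i in selected:
        detail = attend(query, chunks[i], chunks[i])
        for d in range(dim):
            recalled[d] += relevance[i] / norm * detail[d]
    return recalled, sorted(selected)


# Toy memory: the first chunk holds one kind of event, the second another.
memories = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
recalled, selected = hierarchical_recall([0.0, 5.0], memories, chunk_size=4, top_k=1)
```

In this sketch the detailed attention step touches only `top_k * chunk_size` stored vectors rather than the whole history, which is the sense in which recall avoids attending to all intervening events.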
Source: https://arxiv.org/abs/2105.14039