Though deep reinforcement learning agents have achieved unprecedented success
in recent years, their learned policies can be brittle, failing to generalize
to even slight modifications of their environments or unfamiliar situations.
The black-box nature of neural network learning dynamics makes it difficult
to audit trained deep agents and to recover from such failures. In this
paper, we propose a novel representation and learning approach to capture
environment dynamics without using neural networks. It originates from the
observation that, in games designed for people, the effect of an action can
often be perceived in the form of local changes in consecutive visual
observations. Our algorithm is designed to extract such vision-based changes
and condense them into a set of action-dependent descriptive rules, which we
call "visual rewrite rules" (VRRs). We also present preliminary results from
a VRR agent that can explore, expand its rule set, and solve a game via
planning with its learned VRR world model. In several classic games, our
non-deep agent outperforms mainstream deep agents while showing far greater
sample efficiency and more robust generalization.
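
To make the idea of an action-dependent rule built from local visual changes concrete, the following is a minimal illustrative sketch, not the algorithm of this paper. It assumes observations are small symbolic grids (lists of lists) rather than raw pixels, and it takes the bounding box of the changed cells as the "local change". The names diff_cells, extract_rule, and learn_rules are hypothetical and chosen only for this illustration.

```python
def diff_cells(before, after):
    """Return (row, col) coordinates whose value differs between two frames."""
    return [(r, c)
            for r, row in enumerate(before)
            for c, value in enumerate(row)
            if after[r][c] != value]


def extract_rule(before, after, action):
    """Condense one transition into ((action, pattern_before), pattern_after).

    Assumption: the bounding box of all changed cells is used as the local
    patch. Returns None when the action changed nothing.
    """
    changed = diff_cells(before, after)
    if not changed:
        return None
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    r0, r1 = min(rows), max(rows)
    c0, c1 = min(cols), max(cols)

    def crop(grid):
        # Tuples make the patch hashable so it can serve as a dict key.
        return tuple(tuple(grid[r][c0:c1 + 1]) for r in range(r0, r1 + 1))

    return (action, crop(before)), crop(after)


def learn_rules(transitions):
    """Collect rules from (before, action, after) triples, merging duplicates."""
    rules = {}
    for before, action, after in transitions:
        rule = extract_rule(before, after, action)
        if rule is not None:
            key, result = rule
            rules[key] = result
    return rules


# Toy corridor: agent "A" moves right twice across empty cells ".".
frame0 = [["A", ".", ".", "."]]
frame1 = [[".", "A", ".", "."]]
frame2 = [[".", ".", "A", "."]]
rules = learn_rules([(frame0, "right", frame1), (frame1, "right", frame2)])
for (action, pattern), result in rules.items():
    print(action, pattern, "->", result)
```

In this toy setting, both transitions collapse into a single rule because the cropped patch is position-independent. That reuse of one rule across many states is the property that makes such a rule set compact and inspectable.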