Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control. (arXiv:2108.10315v1 [math.OC])

In this paper we aim to provide analysis and insights (often based on
visualization) that explain the beneficial effects of on-line decision making
on top of off-line training. In particular, through a unifying abstract
mathematical framework, we show that the principal AlphaZero/TD-Gammon ideas of
approximation in value space and rollout apply very broadly to deterministic
and stochastic optimal control problems, involving both discrete and continuous
search spaces. Moreover, these ideas can be effectively integrated with other
important methodologies such as model predictive control, adaptive control,
decentralized control, discrete and Bayesian optimization, neural network-based
value and policy approximations, and heuristic algorithms for discrete

Source: https://arxiv.org/abs/2108.10315

