An Online Learning Approach to Optimizing Time-Varying Costs of AoI. (arXiv:2105.13383v1 [cs.NI])

We consider systems that require timely monitoring of sources over a
communication network, where the cost of delayed information is unknown,
time-varying, and possibly adversarial. For the single-source monitoring
problem, we design algorithms that achieve sublinear regret compared to the
best fixed policy in hindsight. For the multiple-source scheduling problem, we
design a new online learning algorithm called
Follow-the-Perturbed-Whittle-Leader and show that it has low regret compared to
the best fixed scheduling policy in hindsight, while remaining computationally
feasible. The algorithm and its regret analysis are novel and of independent
interest to the study of online restless multi-armed bandit problems. We
further design algorithms that achieve sublinear regret compared to the best
dynamic policy when the environment is slowly varying. Finally, we apply our
algorithms to a mobility tracking problem. We consider non-stationary and
adversarial mobility models and illustrate the performance benefit of using our
online learning algorithms compared to an oblivious scheduling policy.
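
To make the notion of regret against the best fixed policy in hindsight concrete, here is a minimal sketch of the classic Follow-the-Perturbed-Leader rule over a small finite set of policies. It is not the paper's Follow-the-Perturbed-Whittle-Leader algorithm, whose details the abstract does not give. The function names, the exponential perturbation, the `noise_rate` parameter, and the full-information cost feedback are all assumptions made for illustration.

```python
import random


def follow_the_perturbed_leader(cost_rounds, num_policies, noise_rate=1.0, seed=0):
    """Classic FTPL over a finite policy set (illustrative, not the paper's method).

    cost_rounds[t][k] is the cost of policy k in round t. It is only used
    after the round-t choice has been made (full-information feedback,
    an assumption of this sketch).
    Returns (choices, total_cost).
    """
    rng = random.Random(seed)
    cumulative = [0.0] * num_policies  # cost each policy has accrued so far
    choices = []
    total_cost = 0.0
    for costs in cost_rounds:
        # Subtract fresh exponential noise from each cumulative cost, then
        # follow the policy with the smallest perturbed cost.
        perturbed = [
            cumulative[k] - rng.expovariate(noise_rate)
            for k in range(num_policies)
        ]
        chosen = min(range(num_policies), key=lambda k: perturbed[k])
        choices.append(chosen)
        total_cost += costs[chosen]
        # Reveal this round's costs and update the running totals.
        for k in range(num_policies):
            cumulative[k] += costs[k]
    return choices, total_cost


def static_regret(cost_rounds, total_cost):
    """Learner's cost minus the cost of the best single policy in hindsight."""
    num_policies = len(cost_rounds[0])
    best_fixed = min(
        sum(costs[k] for costs in cost_rounds) for k in range(num_policies)
    )
    return total_cost - best_fixed


if __name__ == "__main__":
    # Hypothetical toy sequence: policy 1 is always cheaper than policy 0.
    rounds = [[1.0, 0.0] for _ in range(200)]
    choices, total = follow_the_perturbed_leader(rounds, num_policies=2)
    print("static regret:", static_regret(rounds, total))
```

A smaller `noise_rate` means larger perturbations, which trades slower convergence on easy sequences for more robustness when the costs are chosen adversarially.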

Source: https://arxiv.org/abs/2105.13383
