Planning to Fairly Allocate: Probabilistic Fairness in the Restless Bandit Setting. (arXiv:2106.07677v1 [cs.LG])

Restless and collapsing bandits are commonly used to model constrained
resource allocation in settings featuring arms with action-dependent transition
probabilities, such as allocating health interventions among patients [Whittle,
1988; Mate et al., 2020]. However, state-of-the-art Whittle-index-based
approaches to this planning problem either do not consider fairness among arms,
or incentivize fairness without guaranteeing it [Mate et al., 2021].
Additionally, their optimality guarantees only apply when arms are indexable
and threshold-optimal. We demonstrate that incorporating hard fairness
constraints requires coupling the arms, which undermines the tractability
and, by extension, the indexability of the problem. We then introduce
ProbFair, a probabilistically fair stationary policy that maximizes total
expected reward and satisfies the budget constraint while guaranteeing each
arm a strictly positive lower bound on its probability of being pulled at each
timestep. We
evaluate our algorithm on a real-world application, where interventions support
continuous positive airway pressure (CPAP) therapy adherence among obstructive
sleep apnea (OSA) patients, as well as simulations on a broader class of
synthetic transition matrices.
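
To make the fairness guarantee concrete, the following is a minimal sketch and
not the ProbFair algorithm itself; the abstract does not say how the pull
probabilities are computed or how arms are sampled. It assumes that a
stationary policy has already assigned each arm i a fixed pull probability
p_i, that every p_i is at least some floor ell > 0, and that the p_i sum to an
integer budget k. Systematic sampling is swapped in here as one standard way
to draw exactly k distinct arms per timestep with those marginals. The names
sample_fair_pulls, probs, budget, and floor are hypothetical.

```python
import random


def sample_fair_pulls(probs, budget, floor, rng=random):
    """Draw exactly `budget` distinct arms; arm i is drawn with probability probs[i].

    Illustrative only: uses systematic sampling, which is an assumption of this
    sketch and not a method taken from the paper.
    """
    if not 0.0 < floor <= 1.0:
        raise ValueError("floor must lie in (0, 1]")
    if any(p < floor or p > 1.0 for p in probs):
        raise ValueError("every pull probability must lie in [floor, 1]")
    if abs(sum(probs) - budget) > 1e-9:
        raise ValueError("pull probabilities must sum to the budget")

    # Place points at offset, offset + 1, ..., offset + budget - 1 on the
    # line [0, budget). Arm i owns an interval of length probs[i], so it is
    # hit by a point with probability exactly probs[i], and never twice
    # because no interval is longer than 1.
    next_point = rng.random()
    cumulative = 0.0
    chosen = []
    for arm, p in enumerate(probs):
        cumulative += p
        if arm == len(probs) - 1:
            cumulative = float(budget)  # guard against floating-point drift
        if next_point < cumulative and len(chosen) < budget:
            chosen.append(arm)
            next_point += 1.0
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    # Four arms, budget of two pulls per timestep, floor of 0.1 per arm.
    print(sample_fair_pulls([0.5, 0.25, 0.75, 0.5], budget=2, floor=0.1, rng=rng))
```

Because the probabilities are fixed across timesteps, every arm keeps at least
a floor-sized chance of being pulled at every step, while the budget is met
exactly on every draw and not merely in expectation.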


