Remember What You Want to Forget: Algorithms for Machine Unlearning. (arXiv:2103.03279v1 [cs.LG])

We study the problem of forgetting datapoints from a learnt model. In this
case, the learner first receives a dataset $S$ drawn i.i.d. from an unknown
distribution, and outputs a predictor $w$ that performs well on unseen samples
from that distribution. However, at some point in the future, any training data
point $z in S$ can request to be unlearned, thus prompting the learner to
modify its output predictor while still ensuring the same accuracy guarantees.
In our work, we initiate a rigorous study of machine unlearning in the
population setting, where the goal is to maintain performance on the unseen
test loss. We then provide unlearning algorithms for convex loss functions.

For the setting of convex losses, we provide an unlearning algorithm that can
delete up to $O(n/d^{1/4})$ samples, where $d$ is the problem dimension. In
comparison, in general, differentially private learningv(which implies
unlearning) only guarantees deletion of $O(n/d^{1/2})$ samples. This shows that
unlearning is at least polynomially more efficient than learning privately in
terms of dependence on $d$ in the deletion capacity.



Related post