Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias. (arXiv:2303.01504v1 [cs.LG])
With the swift advancement of deep learning, state-of-the-art algorithms have
been deployed in a wide range of social settings. Nonetheless, some algorithms
have been found to exhibit biases and produce unequal outcomes. Existing
debiasing methods face challenges such as poor data utilization or intricate
training requirements. In this work, we find that a backdoor attack can
construct an artificial bias similar to the model bias that arises in standard
training. Given the strong adjustability of backdoor triggers, we are
motivated to mitigate the model bias by carefully designing a reverse
artificial bias created with a backdoor attack. Building on this, we propose a
backdoor debiasing framework based on knowledge distillation, which
effectively reduces the model bias inherited from the original data and
minimizes the security risks introduced by the backdoor attack. The proposed
solution is validated on both image and structured datasets, showing promising
results. This work advances the understanding of backdoor attacks and
highlights their potential for beneficial applications. The code for the study
is available at https://anonymous.4open.science/r/DwB-BC07/.
Source: https://arxiv.org/abs/2303.01504
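
The abstract does not spell out the distillation objective. As a rough illustration of the general idea only (a hypothetical sketch, not the authors' implementation), a student model could be trained against the softened outputs of a teacher whose backdoor-implanted reverse bias offsets the bias learned from the original data, alongside the usual task loss. The function name `distillation_loss` and the hyperparameters `T` and `alpha` below are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Standard knowledge distillation loss (Hinton et al., 2015).

    Hypothetical sketch: the teacher is assumed to carry a reverse
    artificial bias (implanted via backdoor triggers) that counteracts
    the bias learned from the original data, so distilling its softened
    predictions would transfer the debiased behavior to the student.
    """
    # Soft targets: KL divergence between temperature-softened distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

Scaling the soft term by T^2 keeps gradient magnitudes comparable across temperatures, and `alpha` balances the teacher's (assumed debiased) signal against the raw labels, which still reflect the original data.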