Robust Single-step Adversarial Training with Regularizer. (arXiv:2102.03381v1 [cs.LG])

The high training-time cost of multi-step adversarial example
generation is a major challenge in adversarial training. Previous methods try
to reduce the computational burden of adversarial training using single-step
adversarial example generation schemes, which can effectively improve
efficiency but also introduce the problem of catastrophic overfitting, where
the robust accuracy against the Fast Gradient Sign Method (FGSM) can reach nearly
100% whereas the robust accuracy against Projected Gradient Descent (PGD)
suddenly drops to 0% within a single epoch. To address this problem, we propose
a novel Fast Gradient Sign Method with PGD Regularization (FGSMPR) to boost the
efficiency of adversarial training without catastrophic overfitting. Our core
idea is that single-step adversarial training cannot learn robust internal
representations of FGSM and PGD adversarial examples. Therefore, we design a
PGD regularization term to encourage similar embeddings of FGSM and PGD
adversarial examples. The experiments demonstrate that our proposed method can
train a robust deep network for $L_\infty$-perturbations with FGSM adversarial
training and reduce the gap to multi-step adversarial training.
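
As a rough illustration of the idea of penalizing the distance between the embeddings of FGSM and PGD adversarial examples, the following is a minimal sketch on a toy logistic model. It is not the authors' implementation: the abstract does not give the exact form of the regularizer, so the squared-distance penalty, the weight `lam`, the use of the logit as the "embedding", and all function names here are assumptions made for illustration only.

```python
import math

def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)

def logit(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def loss(w, b, x, y):
    # Logistic loss for a label y in {0, 1}.
    p = 1.0 / (1.0 + math.exp(-logit(w, b, x)))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def loss_grad_x(w, b, x, y):
    # Gradient of the logistic loss with respect to the input x.
    p = 1.0 / (1.0 + math.exp(-logit(w, b, x)))
    return [(p - y) * wi for wi in w]

def fgsm(w, b, x, y, eps):
    # Single step of size eps along the sign of the input gradient.
    g = loss_grad_x(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

def pgd(w, b, x, y, eps, alpha, steps):
    # Several steps of size alpha, each projected back onto the
    # L_inf ball of radius eps around the clean input x.
    x_adv = list(x)
    for _ in range(steps):
        g = loss_grad_x(w, b, x_adv, y)
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

def embed(w, b, x):
    # Assumption: the logit stands in for an internal representation.
    return [logit(w, b, x)]

def embedding_gap(w, b, x, y, eps, alpha, steps):
    # Squared distance between FGSM and PGD embeddings (assumed form).
    e_fgsm = embed(w, b, fgsm(w, b, x, y, eps))
    e_pgd = embed(w, b, pgd(w, b, x, y, eps, alpha, steps))
    return sum((a - c) ** 2 for a, c in zip(e_fgsm, e_pgd))

def regularized_loss(w, b, x, y, eps, alpha, steps, lam):
    # FGSM adversarial loss plus the weighted embedding-gap penalty.
    adv_loss = loss(w, b, fgsm(w, b, x, y, eps), y)
    return adv_loss + lam * embedding_gap(w, b, x, y, eps, alpha, steps)

w, b = [0.5, -1.0, 2.0], 0.1
x, y = [0.2, 0.4, -0.3], 1
x_fgsm = fgsm(w, b, x, y, eps=0.1)
gap = embedding_gap(w, b, x, y, eps=0.1, alpha=0.05, steps=3)
total = regularized_loss(w, b, x, y, eps=0.1, alpha=0.05, steps=3, lam=1.0)
```

For this linear toy model the sign of the input gradient never changes, so FGSM and PGD end at the same corner of the perturbation ball and the gap is zero. In a deep network the two attacks generally produce different examples, and a penalty of this kind would then be nonzero.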


