TY - GEN
T1 - Defense-Net
T2 - 18th IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2019
AU - Rakin, Adnan Siraj
AU - Fan, Deliang
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Recent studies have demonstrated that Deep Neural Networks (DNNs) are vulnerable to adversarial input perturbations: meticulously engineered slight perturbations can result in the misclassification of valid images. Adversarial training has been one of the successful defense approaches in recent times. In this work, we propose an alternative to adversarial training: training a separate model, rather than the original classifier, with adversarial examples. We train an adversarial detector network, known as 'Defense-Net', with a strong adversary while training the original classifier with only clean training data. We propose a new adversarial cross-entropy loss function to train Defense-Net to appropriately differentiate between different adversarial examples. Defense-Net addresses three major concerns regarding the development of a successful adversarial defense method. First, our defense does not degrade clean data accuracy, in contrast to traditional adversarial training based defenses. Second, we demonstrate this resiliency with experiments on the MNIST and CIFAR-10 data sets, and show that the state-of-the-art accuracy under the most powerful known white-box attack increases from 94.02% to 99.2% on MNIST, and from 47% to 94.79% on CIFAR-10. Finally, unlike most recent defenses, our approach does not suffer from obfuscated gradients and can successfully defend against strong BPDA, PGD, FGSM and C&W attacks.
AB - Recent studies have demonstrated that Deep Neural Networks (DNNs) are vulnerable to adversarial input perturbations: meticulously engineered slight perturbations can result in the misclassification of valid images. Adversarial training has been one of the successful defense approaches in recent times. In this work, we propose an alternative to adversarial training: training a separate model, rather than the original classifier, with adversarial examples. We train an adversarial detector network, known as 'Defense-Net', with a strong adversary while training the original classifier with only clean training data. We propose a new adversarial cross-entropy loss function to train Defense-Net to appropriately differentiate between different adversarial examples. Defense-Net addresses three major concerns regarding the development of a successful adversarial defense method. First, our defense does not degrade clean data accuracy, in contrast to traditional adversarial training based defenses. Second, we demonstrate this resiliency with experiments on the MNIST and CIFAR-10 data sets, and show that the state-of-the-art accuracy under the most powerful known white-box attack increases from 94.02% to 99.2% on MNIST, and from 47% to 94.79% on CIFAR-10. Finally, unlike most recent defenses, our approach does not suffer from obfuscated gradients and can successfully defend against strong BPDA, PGD, FGSM and C&W attacks.
KW - Adversarial Defense
KW - Detector
KW - Robustness
UR - http://www.scopus.com/inward/record.url?scp=85072959550&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072959550&partnerID=8YFLogxK
U2 - 10.1109/ISVLSI.2019.00067
DO - 10.1109/ISVLSI.2019.00067
M3 - Conference contribution
AN - SCOPUS:85072959550
T3 - Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI
SP - 332
EP - 337
BT - Proceedings - 2019 IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2019
PB - IEEE Computer Society
Y2 - 15 July 2019 through 17 July 2019
ER -