THESIS
2022
1 online resource (xxii, 214 pages) : color illustrations
Abstract
Although deep learning has many practical applications, deep neural networks are
known to be vulnerable to adversarial examples: small perturbations of inputs
that can fool the networks into making wrong predictions.
In this thesis, we propose new methods and theories to evaluate, understand, and
improve the adversarial robustness of deep neural networks.
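For concreteness, one common formalization of an adversarial example (the abstract
does not state a specific threat model, so the l-infinity bound below is an
assumption) is a perturbed input x + delta satisfying
\[
\|\delta\|_\infty \le \varepsilon
\quad\text{and}\quad
\arg\max_k f_k(x+\delta) \ne y,
\]
where $f$ is the classifier, $y$ is the correct label, and $\varepsilon$ is a
small perturbation budget.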
First, we investigate black-box adversarial attacks, where the attacker has
no information about the target model except for its output. We propose two new
methods, ZOHA and TREMBA, to accelerate black-box attacks. In ZOHA,
second-order information is incorporated into zeroth-order optimization.
In TREMBA, we exploit the transferability of adversarial examples to construct
a new search space, greatly reducing the number of queries needed for black-box
attacks. These algorithms demonstrate that black-box attacks can pose a
practical threat to deployed models.
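As a rough illustration of the black-box setting (this is not the ZOHA or TREMBA
algorithm from the thesis; the function and parameter names below are hypothetical),
an attacker that can only query a scalar loss can estimate an ascent direction by
finite differences, and the same queries can yield a crude curvature estimate used
to precondition the step:

import numpy as np

def zeroth_order_step(loss_fn, x, sigma=0.01, n_samples=20, lr=0.1):
    # loss_fn(x) is the only access to the target model (a black-box query).
    d = x.size
    grad_est = np.zeros(d)
    hess_diag_est = np.zeros(d)
    f0 = loss_fn(x)
    for _ in range(n_samples):
        u = np.random.randn(d)                       # random probe direction
        f_plus = loss_fn(x + sigma * u)
        f_minus = loss_fn(x - sigma * u)
        # two-point finite-difference gradient estimate along u
        grad_est += (f_plus - f_minus) / (2 * sigma) * u
        # crude diagonal curvature estimate reusing the same queries
        hess_diag_est += (f_plus - 2 * f0 + f_minus) / (sigma ** 2) * (u ** 2)
    grad_est /= n_samples
    hess_diag_est = np.abs(hess_diag_est) / n_samples + 1e-6
    # precondition the ascent step with the estimated curvature
    return x + lr * grad_est / hess_diag_est

TREMBA additionally restricts the search to a low-dimensional space learned from
transferable adversarial perturbations, which is not captured by this sketch.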
Second, we study the existence of adversarial examples and its relationship
to benign overfitting. We provide a theoretical explanation for why adversarial
examples exist under standard training of neural networks: adversarial examples
are by-products of overfitting the noise in overparameterized models. Moreover,
our theory explains the trade-off between robustness and clean performance.
Lastly, we address the poor generalization of adversarial training with a novel
test-time fine-tuning strategy. Standard adversarial training does not necessarily
achieve near-optimal generalization performance on test samples. The Bayes-optimal
robust estimator requires test-time adaptation, and such adaptation can lead to
significantly better performance. Motivated by this observation, we propose a
practical, easy-to-implement method that fine-tunes adversarially trained
networks with an additional self-supervised test-time adaptation step.
We also introduce a meta adversarial training method to find a good starting
point for test-time fine-tuning. Empirical experiments demonstrate the
effectiveness of the proposed strategy.
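A minimal sketch of what such a test-time adaptation step could look like (the
abstract does not specify the self-supervised objective, so prediction-entropy
minimization is used here purely as a placeholder, and all names are hypothetical):

import copy
import torch

def entropy_objective(model, x):
    # Placeholder label-free objective: minimize prediction entropy on the test batch.
    p = torch.softmax(model(x), dim=1)
    return -(p * p.clamp_min(1e-8).log()).sum(dim=1).mean()

def test_time_adapt(model, x_batch, steps=5, lr=1e-4):
    # Fine-tune a copy of the (adversarially trained) model on the unlabeled test
    # batch using only the self-supervised objective, then predict with the copy.
    adapted = copy.deepcopy(model)
    adapted.train()
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = entropy_objective(adapted, x_batch)
        loss.backward()
        opt.step()
    adapted.eval()
    with torch.no_grad():
        return adapted(x_batch).argmax(dim=1)

In the thesis's setup, meta adversarial training provides an initialization from
which such a few-step adaptation is effective; that component is not reflected in
the sketch above.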