THESIS
2022
1 online resource (vii, 63 pages) : color illustrations
Abstract
An intriguing property of deep neural networks is that adversarial attacks can transfer across different
models. Existing methods such as the Intermediate Level Attack (ILA) further improve
black-box transferability by fine-tuning a reference adversarial attack so as to maximize the perturbation at a pre-specified layer of the source model. In this work, we revisit ILA and evaluate the effect of applying augmentation to the images before passing them to ILA. We start by examining common image augmentation techniques and then explore novel augmentations that make use of adversarial perturbations. Based on these observations, we propose Aug-ILA, an improved
method that enhances the transferability of an existing attack under the ILA framework.
Specifically, Aug-ILA has three main characteristics: typical image augmentation such as random
cropping and resizing applied to all ILA inputs, reverse adversarial update on the clean image, and
interpolation between two attacks on the reference image. Our experimental results show that Aug-ILA outperforms ILA and its subsequent variants, as well as state-of-the-art transfer-based attacks, by achieving 96.99% and 87.84% average attack success rates with perturbation budgets 13/255
(0.05) and 8/255 (0.03), respectively, on nine undefended models.
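To make the three components concrete, below is a minimal PyTorch sketch of an Aug-ILA-style fine-tuning loop, reconstructed from this abstract alone. The exact hyper-parameters, crop scales, and the form of the reverse adversarial update (a sign step here) are assumptions, and adv_a / adv_b stand for any two reference attacks; none of these names come from the thesis itself.

import torch
import torchvision.transforms as T

def ila_projection_loss(feat_clean, feat_ref, feat_adv):
    # ILA maximizes the projection of the current perturbation's
    # mid-layer feature shift onto that of the reference attack;
    # the sign is flipped so the loop below can minimize.
    ref_dir = (feat_ref - feat_clean).flatten(1)
    cur_dir = (feat_adv - feat_clean).flatten(1)
    return -(ref_dir * cur_dir).sum(dim=1).mean()

def aug_ila(model, layer, x, adv_a, adv_b, eps=8/255, steps=10,
            step_size=1/255, lam=0.5, rev_step=2/255):
    feats = {}
    handle = layer.register_forward_hook(
        lambda m, inp, out: feats.update(f=out))

    # Random crop-and-resize: torchvision samples one set of crop
    # parameters per call, so concatenating the three ILA inputs into
    # one batch applies the *same* augmentation to all of them.
    augment = T.RandomResizedCrop(x.shape[-1], scale=(0.7, 1.0))

    # (1) Reverse adversarial update: nudge the clean anchor away
    #     from the adversarial direction (sign-step form assumed).
    x_clean = (x - rev_step * (adv_a - x).sign()).clamp(0, 1)
    # (2) Interpolate two reference attacks into one reference image.
    x_ref = (lam * adv_a + (1 - lam) * adv_b).clamp(0, 1)

    x_adv = adv_a.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # (3) Augment all ILA inputs before the forward pass.
        batch = augment(torch.cat([x_clean, x_ref, x_adv], dim=0))
        model(batch)  # hook captures the mid-layer features
        f_clean, f_ref, f_adv = feats['f'].chunk(3, dim=0)
        loss = ila_projection_loss(f_clean, f_ref, f_adv)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv - step_size * grad.sign()
            # Project back into the eps-ball around the clean image.
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    handle.remove()
    return x_adv.detach()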
Moreover, being a strong transfer-based attack, Aug-ILA can also be adopted for adversarial training. We propose a two-phase training scheme that aims to both speed up training and achieve better robustness than previous works: after a pre-training phase using an existing framework, we further employ Aug-ILA to fine-tune the model. Extensive experiments show that Aug-ILA can boost model robustness by up to 5% while the model still converges in a reasonable time.
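As a hedged sketch of how such a two-phase scheme could be wired up, the code below reuses the aug_ila sketch above. The epoch split, the choice of FGSM and PGD as the cheap pre-training and reference attacks, and all hyper-parameters are illustrative assumptions, not the thesis's actual configuration.

import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    # Single-step attack used as the cheap pre-training attack and as
    # one of the two reference attacks for Aug-ILA fine-tuning.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def pgd(model, x, y, eps, steps=7, alpha=2/255):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv.detach()

def two_phase_adv_training(model, loader, optimizer, mid_layer,
                           pre_epochs=90, ft_epochs=10, eps=8/255):
    for epoch in range(pre_epochs + ft_epochs):
        for x, y in loader:
            if epoch < pre_epochs:
                # Phase 1: fast pre-training with an existing
                # single-step adversarial-training framework.
                x_adv = fgsm(model, x, y, eps)
            else:
                # Phase 2: fine-tune on stronger Aug-ILA examples;
                # two reference attacks feed its interpolation step.
                adv_a = pgd(model, x, y, eps)
                adv_b = fgsm(model, x, y, eps)
                x_adv = aug_ila(model, mid_layer, x, adv_a, adv_b, eps)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x_adv), y)
            loss.backward()
            optimizer.step()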