THESIS
2021
1 online resource (xiv, 89 pages) : illustrations (some color)
Abstract
In this thesis, we focus on the theory of deep neural networks, including generalization
and adversarial robustness, two essential problems in deep learning.
The generalization of deep neural networks is still a mystery, although deep learning has
been successfully applied to many areas. We pursue an appropriate explanation for the
success and failure of margin-based Rademacher complexity bounds on the generalization
ability of deep neural networks. In the traditional machine learning community,
margin-based Rademacher complexity bounds have been used to explain the generalization
of bagging and boosting, and it has been shown that the generalization ability of these
complex classifiers might be due to margin enlargement during training.
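For context, a classical bound of this flavor (a sketch in the spirit of Koltchinskii and
Panchenko; constants vary across references) reads: for a function class $\mathcal{F}$,
margin level $\gamma > 0$, and $n$ i.i.d. samples, with probability at least $1 - \delta$,

\[
\mathbb{P}\big[ y f(x) \le 0 \big] \;\le\; \hat{\mathbb{P}}_n\big[ y f(x) \le \gamma \big]
+ \frac{2}{\gamma}\, \mathfrak{R}_n(\mathcal{F}) + \sqrt{\frac{\ln(1/\delta)}{2n}},
\]

where $\mathfrak{R}_n(\mathcal{F})$ denotes the Rademacher complexity of $\mathcal{F}$.
Enlarging training margins shrinks the empirical term $\hat{\mathbb{P}}_n[y f(x) \le \gamma]$,
which is why margin growth is taken as a proxy for better generalization.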
However, Breiman gave examples in which uniform improvements on training margins do
not guarantee a decrease of generalization error, a phenomenon known as Breiman's dilemma.
We show that this phenomenon also exists in deep neural networks and explore its possible
explanations. To reach this goal, we introduce margin dynamics into deep neural networks
to analyze their generalization abilities. A novel perspective is provided to explain the
relationship between margin dynamics and generalization error, based on phase transitions
in the dynamics of normalized margin distributions. Large training margins may exhibit
dynamics different from small ones: the latter typically undergo a monotone decay during
training to reduce the loss, whereas the former may first drop and then grow. We find that
such a phase transition is related to the trade-off between model expressive power and
data complexity. It happens when the expressive power of deep neural networks is
comparable to the data complexity; in this case, to improve small training margins one
has to sacrifice the large ones. On the other hand, we show that Breiman's dilemma appears
in deep neural networks when models are over-expressive relative to the data, such that
one can uniformly improve both large and small training margins; this loses the phase
transition above and causes the prediction of generalization error from training margin
distributions to fail.
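As an illustration of the kind of quantity being tracked, here is a minimal PyTorch
sketch for computing a normalized margin distribution once per epoch. The normalization
by a product of spectral norms is one common capacity proxy from the literature and an
assumption here, not necessarily the thesis's exact normalization factor.

    import torch

    def multiclass_margins(logits, labels):
        # margin = correct-class logit minus the largest other logit;
        # a negative margin corresponds to a misclassified example
        correct = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
        others = logits.clone()
        others.scatter_(1, labels.unsqueeze(1), float("-inf"))
        return correct - others.max(dim=1).values

    def spectral_norm_product(model):
        # capacity proxy used to normalize margins: product of spectral
        # norms of all weight matrices (conv kernels flattened to matrices)
        prod = 1.0
        for p in model.parameters():
            if p.dim() >= 2:
                prod *= torch.linalg.matrix_norm(p.flatten(1), ord=2).item()
        return prod

    @torch.no_grad()
    def normalized_margin_distribution(model, loader, device="cpu"):
        # call once per epoch to trace the margin dynamics over training
        model.eval()
        all_margins = []
        for x, y in loader:
            logits = model(x.to(device))
            all_margins.append(multiclass_margins(logits, y.to(device)).cpu())
        return torch.cat(all_margins) / spectral_norm_product(model)

Plotting these distributions epoch by epoch is what reveals whether small and large
margins move together or exhibit the phase transition described above.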
The adversarial robustness of deep neural networks is another essential problem in deep
learning. Most existing adversarial defense methods require adversarial training to
improve the robustness of neural networks, and hence have to trade natural accuracy off
against adversarial robustness. Recently, some works have shown that Neural Ordinary
Differential Equations (ODEs) may exhibit certain adversarial robustness without
sacrificing natural accuracy, and it remains open whether such designs lead to genuine or
fake robustness. Inspired by dynamical systems theory, we design a stabilized neural ODE
network named SONet, whose ODE blocks are skew-symmetric and proved to be stable in the
sense of Lyapunov. With only natural training, SONet achieves robustness against
gradient-based attacks comparable to that of state-of-the-art adversarial training
methods, without sacrificing natural accuracy.
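To illustrate the stability mechanism, here is a minimal sketch of a skew-symmetric ODE
block; the parameterization and the use of the torchdiffeq library are assumptions for
illustration, not the exact SONet architecture.

    import torch
    from torchdiffeq import odeint  # assumed third-party dependency

    class SkewBlock(torch.nn.Module):
        # dh/dt = A h with A = W - W^T, so A^T = -A (skew-symmetric).
        # The Lyapunov function V(h) = ||h||^2 then satisfies
        # dV/dt = 2 h^T A h = 0: trajectories preserve the norm, so the
        # block is stable in the sense of Lyapunov.
        def __init__(self, dim):
            super().__init__()
            self.W = torch.nn.Parameter(torch.randn(dim, dim) / dim ** 0.5)

        def forward(self, t, h):
            A = self.W - self.W.T
            return h @ A.T

    block = SkewBlock(16)
    h0 = torch.randn(4, 16)
    t = torch.tensor([0.0, 1.0])
    # dopri5 is an adaptive-step Runge-Kutta solver, which matters for
    # the gradient-masking discussion that follows
    h1 = odeint(block, h0, t, method="dopri5")[-1]
    print(h0.norm(dim=1), h1.norm(dim=1))  # norms should match closely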
To understand the underlying mechanism behind this remarkable robustness, we explore more
deeply the relationship between numerical ODE solvers and gradient-based or gradient-free
adversarial attacks. Our results disclose that the adversarial robustness of ODE-based
networks mainly comes from the gradient masking effect in numerical ODE solvers with
adaptive step sizes, which leads to a false sense of adversarial robustness.
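A standard way to probe for such gradient masking is to compare a gradient-based attack
against a gradient-free one under the same perturbation budget. Below is a textbook
L-infinity PGD sketch (an illustration, not necessarily the thesis's exact attack setup).

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
        # standard L-infinity PGD; against a gradient-masking model this
        # attack stalls because backpropagated gradients are uninformative
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()
            # project back into the eps-ball around x and the valid range
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
        return x_adv

If a model keeps its accuracy under this attack yet loses it to a gradient-free attack
(e.g., SPSA) at the same eps, the apparent robustness is an artifact of obfuscated
gradients rather than genuine robustness.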