THESIS
2020
1 online resource (xxi, 177 pages) : color illustrations
Abstract
Robustness and generalization of deep neural networks are two closely related topics.
Specifically, robust estimation under Huber’s contamination model has become
an important topic in statistics and theoretical computer science. Rate-optimal
procedures such as Tukey’s median and other estimators based on statistical
depth functions are impractical because of their computational intractability.
Margin enlargement over training data has been an important strategy in machine
learning since the era of perceptrons, aimed at boosting the robustness of
classifiers and thereby their generalization ability. In this thesis, we first study
rate-optimal and computationally feasible estimators under Huber’s contamination
model, by building connections between f-GANs, proper scoring rules, and
depth-based estimators. For example, we show that depth functions that lead to
rate-optimal robust estimators can all be viewed as variational lower bounds of
the total variation distance in the framework of f-Learning.
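To make the connection concrete, here is a minimal sketch in our own notation, assuming location estimation of a Gaussian model N(\eta, I_d) from contaminated samples; it illustrates the idea rather than restating the thesis’s results. The total variation distance admits the variational form
\[
  \mathrm{TV}(P, Q) \;=\; \sup_{T:\,\mathcal{X}\to\{0,1\}}
  \Big\{ \mathbb{E}_{P}[T(X)] - \mathbb{E}_{Q}[T(X)] \Big\}.
\]
Restricting the discriminators to half-space indicators T_{u,\eta}(x) = \mathbf{1}\{u^\top(x-\eta) \ge 0\} gives a lower bound, and since \mathbb{E}_Q[T_{u,\eta}] = 1/2 when Q = N(\eta, I_d), minimizing the restricted bound over \eta yields
\[
  \hat{\eta} \;=\; \operatorname*{arg\,min}_{\eta}\, \sup_{\|u\|=1}
  \Big\{ P_n\big(u^\top(X-\eta) \ge 0\big) - \tfrac{1}{2} \Big\}
  \;=\; \operatorname*{arg\,max}_{\eta}\, \inf_{\|u\|=1} P_n\big(u^\top(X-\eta) < 0\big),
\]
which is Tukey’s median, the deepest point (up to ties on half-space boundaries).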
We then revisit Breiman’s dilemma in deep neural networks through the lens of
recently proposed spectrally normalized margins. A novel perspective is provided
to explain Breiman’s dilemma, based on phase transitions in the dynamics of the
normalized margin distribution, which reflect the trade-off between the expressive
power of models and the complexity of data.
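As a companion illustration of the quantity whose dynamics are tracked, below is a minimal NumPy sketch (our own code, not from the thesis): it computes classification margins of a bias-free ReLU network and normalizes them by the product of layer spectral norms, the standard Lipschitz proxy behind spectrally normalized margins. The function name normalized_margins and the toy shapes are illustrative assumptions.

import numpy as np

def spectral_norm(W):
    # Largest singular value of a weight matrix (its spectral norm).
    return np.linalg.svd(W, compute_uv=False)[0]

def normalized_margins(weights, X, y):
    # Forward pass through a bias-free ReLU network: h <- max(h W^T, 0).
    h = X
    for W in weights[:-1]:
        h = np.maximum(h @ W.T, 0.0)
    logits = h @ weights[-1].T  # final linear layer, no activation
    # Classification margin per example: correct logit minus best rival logit.
    idx = np.arange(X.shape[0])
    correct = logits[idx, y]
    rivals = logits.copy()
    rivals[idx, y] = -np.inf
    margins = correct - rivals.max(axis=1)
    # Normalize by the product of layer spectral norms, a Lipschitz proxy.
    lipschitz = np.prod([spectral_norm(W) for W in weights])
    return margins / lipschitz

# Toy usage: a 10 -> 64 -> 32 -> 3 network on random data.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(64, 10)),
           rng.normal(size=(32, 64)),
           rng.normal(size=(3, 32))]
X = rng.normal(size=(100, 10))
y = rng.integers(0, 3, size=100)
gammas = normalized_margins(weights, X, y)

Tracking the histogram of such normalized margins across training epochs is one way to visualize the phase transition described above.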