THESIS
2022
1 online resource (xxiii, 315 pages) : illustrations (chiefly color)
Abstract
Generative models have received considerable interest in modern machine learning
and statistics as a method for data generation and representation learning.
Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs)
are two important classes of implicit generative modeling methods, which
model the transformation from the latent variable to the data variable so as to
simulate the sampling process without specifying probability distributions explicitly.
Owing to the recent development of deep learning, generative models have yielded
remarkable empirical performance in a wide range of applications.
Despite the empirical success of generative models, their theoretical properties
have been much less well justified, especially those of GANs. This motivates the first
thrust of this thesis, which is a statistical analysis of f-divergence GANs. Our theory
gives rise to a new class of GAN algorithms with higher statistical efficiency
and sheds light on statistical questions including the relationship between
a modern algorithm (GAN) and a classical method (maximum likelihood
estimation), as well as how various f-divergences behave. We also provide a
unified view of GAN and VAE under the principled framework of bidirectional
generative models. In addition, we extensively adapt our proposed methods to
practical tasks in computer vision and natural language processing, achieving
state-of-the-art performance. In particular, we present a new model architecture
and learning formulation, based on our efficient GAN approach, for coherent and
diverse text generation.
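(As a brief illustration of the objectives discussed above, and using standard notation not drawn from the thesis itself: for a convex function f with f(1) = 0, the f-divergence between the data distribution P and the model distribution Q admits the well-known variational lower bound of Nguyen et al., used by f-GANs,

    D_f(P || Q)  >=  sup_T  { E_{x ~ P}[ T(x) ]  -  E_{x ~ Q}[ f^*(T(x)) ] },

where f^* is the convex conjugate of f and T is a discriminator-like critic. Choosing f(u) = u log u recovers the KL divergence, whose minimization over Q coincides with maximum likelihood estimation; this is the classical endpoint of the GAN-versus-MLE relationship mentioned above.)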
Structures are pervasive and inherent in human recognition and understanding
of the real world. The second part of this thesis shifts the focus to the structural
properties of generative models. An emerging field in this direction is disentangled
representation learning, which starts from the premise that real-world data are
generated by a few explanatory factors and aims at recovering those factors
as well as their underlying structure. Disentangled representations of data have
numerous benefits in the interpretability of deep learning models, downstream
learning tasks, and controllable generation. The difficulty of disentanglement
depends on the amount of supervision available as well as the complexity of the
underlying structures. It is well established that disentanglement is impossible in a
fully unsupervised setting. The existing disentanglement literature mostly considers
simple structures such as independence or conditional independence given some
observed auxiliary variables, while a more general (and challenging) structure
is the causal structure where the underlying factors are connected by a causal
graph. We formalize the failure of previous methods in the causal case and
propose a method for disentangling the causal factors based on a bidirectional
generative model with a causal prior. We provide theoretical justification on the
identifiability and asymptotic convergence of the proposed algorithm. Finally, we
develop a nonparametric method to learn causal structures from observational
data.
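(To make the causal setting concrete, a common illustrative formulation, not necessarily the exact model of the thesis, replaces the usual independent prior with a structural causal model over the latent factors. For instance, in a linear structural equation model,

    z = A^T z + epsilon,        x = g(z),

where A encodes the weighted adjacency matrix of a directed acyclic causal graph over the factors z, epsilon is exogenous noise, and g is the generator. Disentanglement then requires recovering both z and the structure in A, which independence-based methods cannot do, since the factors are no longer statistically independent.)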