THESIS
2019
xii, 106 pages : illustrations ; 30 cm
Abstract
Learning from data is the central ability of machine learning and modern artificial intelligence systems. Deep learning provides the powerful capability of learning representations from data and has achieved great success in many perception tasks, such as visual object recognition and speech recognition. While deep learning excels at learning representations from data, probabilistic graphical models (PGMs) excel at learning statistical relationships among variables (reasoning) and learning model structures (structure learning) from data. Both capabilities are important to machine intelligence, and they are mutually beneficial. The representation learning ability of deep neural networks can be incorporated into probabilistic graphical models to enhance their reasoning capabilities. Conversely, the reasoning and structure learning abilities of probabilistic graphical models can be used to improve the power of deep neural networks and to learn model structures for them. The synergy between neural and probabilistic machine learning provides more powerful and flexible tools for learning data representations and model structures.
The aim of this thesis is to advance both the deep learning and probabilistic graphical model fields by harnessing the synergy between them for unsupervised representation and structure learning. This thesis focuses on two parts: learning representations for probabilistic graphical models with deep learning, and learning structures for deep learning with probabilistic graphical models. The capability of deep neural networks and the flexibility of probabilistic graphical models make the resulting methods suitable for various supervised and unsupervised tasks, such as recommender systems, social network analysis, classification, and cluster analysis. The contributions of this thesis are as follows.
First, we propose the Collaborative Variational Autoencoder (CVAE) and the Relational Variational Autoencoder (RVAE), which bring deep generative models such as the Variational Autoencoder (VAE) into probabilistic graphical models to perform representation learning on high-dimensional data for supervised tasks such as recommendation and link prediction. Joint learning algorithms involving variational and amortized inference are proposed to enable the learning of such models.
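The amortized-inference idea underlying these models can be sketched briefly: an encoder network maps each input directly to the parameters of the approximate posterior q(z|x), and the reparameterization trick keeps the sampling step differentiable so the evidence lower bound (ELBO) can be optimized by gradient methods. The sketch below is a minimal, hypothetical illustration with a linear encoder; it is not the CVAE/RVAE architecture itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_mu, W_logvar):
    """Amortized inference: one shared map takes each input x directly
    to the parameters (mu, log-variance) of q(z|x)."""
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps so gradients can flow through the
    sampling step (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu, logvar):
    """KL(q(z|x) || N(0, I)), the regularization term in the ELBO."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)

# Toy batch: 4 inputs of dimension 3, latent dimension 2.
x = rng.standard_normal((4, 3))
W_mu = rng.standard_normal((3, 2)) * 0.1
W_logvar = rng.standard_normal((3, 2)) * 0.1

mu, logvar = encode(x, W_mu, W_logvar)
z = reparameterize(mu, logvar, rng)
kl = kl_to_standard_normal(mu, logvar)
print(z.shape, kl.shape)  # (4, 2) (4,)
```

In a model like CVAE or RVAE these quantities enter a larger joint objective that also covers the graphical-model side; the sketch shows only the VAE inference step.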
Second, we propose the Tree-Receptive-Field network (TRF-net) to automatically learn sparsely connected multilayer feedforward neural networks from scratch in an unsupervised way. By analogy with the sparse connectivity of convolutional networks, we learn the sparse structure of feedforward neural networks by learning probabilistic structures among variables from data in an unsupervised way, exploiting rich information in the data beyond class labels, which is often discarded in supervised classification.
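The abstract does not spell out TRF-net's structure-learning procedure. As a loosely analogous, hypothetical illustration of learning a probabilistic structure among variables from unlabeled data, the classic Chow-Liu approach builds a maximum-weight spanning tree over pairwise empirical mutual information; this sketch uses that idea on toy binary data and is not the thesis's algorithm.

```python
import numpy as np

def pairwise_mi(X):
    """Empirical mutual information between all pairs of binary columns."""
    n, d = X.shape
    mi = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            m = 0.0
            for a in (0, 1):
                for b in (0, 1):
                    p_ab = np.mean((X[:, i] == a) & (X[:, j] == b))
                    p_a = np.mean(X[:, i] == a)
                    p_b = np.mean(X[:, j] == b)
                    if p_ab > 0:
                        m += p_ab * np.log(p_ab / (p_a * p_b))
            mi[i, j] = mi[j, i] = m
    return mi

def chow_liu_edges(mi):
    """Prim's algorithm: maximum-weight spanning tree over the MI matrix."""
    d = mi.shape[0]
    in_tree = {0}
    edges = []
    while len(in_tree) < d:
        best = max(((i, j) for i in in_tree for j in range(d) if j not in in_tree),
                   key=lambda e: mi[e])
        edges.append(best)
        in_tree.add(best[1])
    return edges

rng = np.random.default_rng(0)
# Toy data: x1 copies x0 with 10% noise, x2 is independent of both.
x0 = rng.integers(0, 2, 500)
x1 = (x0 ^ (rng.random(500) < 0.1)).astype(int)
x2 = rng.integers(0, 2, 500)
X = np.column_stack([x0, x1, x2])
print(chow_liu_edges(pairwise_mi(X)))  # the strongly dependent pair (0, 1) is linked
```

A learned dependency structure of this kind could, in principle, determine which units in adjacent layers are connected, mirroring how convolutional networks wire each unit only to a local receptive field.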
Finally, we propose the Latent Tree Variational Autoencoder (LTVAE) to learn latent superstructures in the variational autoencoder and simultaneously perform unsupervised representation and structure learning for multidimensional cluster analysis. Cluster analysis of high-dimensional data, such as images and texts, is challenging, and real-world data often have multiple valid ways of clustering rather than just one. We seek to simultaneously learn representations of high-dimensional data and perform multi-facet clustering in a single model. Learning algorithms using stepwise EM with message passing are proposed for end-to-end learning of deep neural networks and Bayesian networks.
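Stepwise EM (online EM with decaying step sizes) can be illustrated on a much simpler model than a latent tree: the sketch below runs one pass over a stream of points from a two-component 1-D Gaussian mixture, blending old sufficient statistics with each point's expected statistics. The model, step-size schedule, and initialization here are illustrative assumptions, not the LTVAE algorithm itself.

```python
import numpy as np

def e_step(x, pi, mu, var):
    """Posterior responsibilities p(component | x) for one data point."""
    dens = pi * np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return dens / dens.sum()

def stepwise_update(stats, x, r, eta):
    """Stepwise EM: interpolate the running sufficient statistics toward
    the single-point statistics with step size eta."""
    s0, s1, s2 = stats
    return ((1 - eta) * s0 + eta * r,
            (1 - eta) * s1 + eta * r * x,
            (1 - eta) * s2 + eta * r * x ** 2)

rng = np.random.default_rng(0)
# Toy stream: mixture of N(-2, 1) and N(+2, 1).
data = np.concatenate([rng.normal(-2, 1, 500), rng.normal(2, 1, 500)])
rng.shuffle(data)

pi, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
stats = (pi.copy(), pi * mu, pi * (var + mu ** 2))
for t, x in enumerate(data):
    r = e_step(x, pi, mu, var)
    stats = stepwise_update(stats, x, r, eta=(t + 2) ** -0.6)
    s0, s1, s2 = stats
    pi, mu = s0 / s0.sum(), s1 / s0            # M-step from current stats
    var = np.maximum(s2 / s0 - mu ** 2, 1e-3)  # floor keeps variances valid
print(np.sort(mu))  # component means move toward the true -2 and +2
```

In LTVAE, the E-step responsibilities would come from message passing in the latent tree rather than a flat mixture, and the updates interleave with gradient steps on the neural network.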