THESIS
2015
xvi, 130 pages : illustrations ; 30 cm
Abstract
Many real-world applications involve multilabel classification, in which multiple labels can be
associated with each sample. In many multilabel applications, structures exist among labels. A
popular structure on labels is the label hierarchy, which can be obtained with the help of domain
experts, or be automatically created from the data using procedures such as hierarchical clustering
or Bayesian network structure learning. This label hierarchy may then be arranged as a tree, as in
text categorization, or more generally, as a directed acyclic graph (DAG), as in the Gene Ontology
used in gene functional analysis. However, current research efforts typically ignore such label
structure or can only exploit the dependencies in a label tree.
Instead of a label hierarchy, some implicit structures may exist between labels. For instance,
some labels are strongly correlated with each other. In text categorization, for example,
an article on “sports” may also be labeled “entertainment”; in image classification,
an image annotated with “jungle” may also be tagged with “bushes”. Besides the presence of
label correlations, we may not have access to all the true labels of each training sample in such
applications. For example, many image annotation tasks use crowdsourcing platforms to collect
labels. For each image, the workers may only provide a small, incomplete set of answers to the
queried labels. Existing algorithms are often incapable of handling both label correlations and
missing labels.
In this thesis, we introduce various methods that exploit the label structure for multilabel classification.
We first explore the use of a label hierarchy. Specifically, we propose three works
motivated by three different aspects of the problem. In the first work, we propose novel multilabel
algorithms for the mandatory leaf node prediction problem, in which the prediction paths of a given
test example are required to end at leaf nodes of the label hierarchy. This problem setting is particularly
useful when the leaf nodes have much stronger semantic meaning than the internal nodes.
In the second work, we discuss proper loss functions for multilabel problems when label hierarchies
exist, and derive their corresponding Bayes-optimal classifiers. Thirdly, we present a probabilistic
framework by incorporating hierarchical label constraints via posterior regularization such that the
hierarchical constraints hold in expectation for the output labels during training. For the second
kind of label structure, we consider the case in which correlations exist between labels. We propose a
probabilistic model that can simultaneously capture label correlations and handle missing labels.
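
The mandatory leaf node prediction constraint described in this abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it only checks whether a predicted label set respects a tree hierarchy and ends at leaf nodes. The toy hierarchy `PARENT` and all function names are hypothetical, and a tree (not a DAG) is assumed for simplicity.

```python
from typing import Dict, Optional, Set

# Hypothetical toy hierarchy (assumption): child -> parent, None marks the root.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "sports": "root",
    "science": "root",
    "football": "sports",
    "tennis": "sports",
    "physics": "science",
}


def leaves(parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return the nodes that are never listed as another node's parent."""
    internal = {p for p in parent.values() if p is not None}
    return set(parent) - internal


def is_hierarchy_consistent(labels: Set[str], parent: Dict[str, Optional[str]]) -> bool:
    """A label set is consistent if every positive label's parent is also positive."""
    return all(parent[l] is None or parent[l] in labels for l in labels)


def ends_at_leaves(labels: Set[str], parent: Dict[str, Optional[str]]) -> bool:
    """Every positive label without a positive child must be a leaf of the tree."""
    leaf_set = leaves(parent)
    has_positive_child = {parent[l] for l in labels if parent[l] is not None}
    return all(l in leaf_set for l in labels if l not in has_positive_child)


def is_valid_mlnp(labels: Set[str], parent: Dict[str, Optional[str]]) -> bool:
    """Check both conditions for a non-empty predicted label set."""
    return (
        bool(labels)
        and is_hierarchy_consistent(labels, parent)
        and ends_at_leaves(labels, parent)
    )
```

Under these assumptions, `is_valid_mlnp({"root", "sports", "football"}, PARENT)` holds, while `{"root", "sports"}` is rejected because its path stops at an internal node.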