Multi-input and multi-output machine learning are among the chief challenges of the big-data era
(the variety of the data). Such datasets are too large and too complex to be handled by
traditional machine learning methods, so new solutions must be found. In this thesis, we
investigate the effect of dependencies between multiple inputs and between multiple outputs, and
we show that exploiting these dependencies helps to solve the problems more accurately and at
lower cost, with fewer parameters. As case studies, we choose prediction tasks in multi-label
learning, where each label is equivalent to an output task, and in multimodal learning, where
each modality is equivalent to an input channel.
Multi-label learning becomes an extreme classification task when the number of labels (tags) is
extremely large. User-generated labels for any type of online data can be sparse for an
individual user but intractably numerous across all users. For example, in web and document
categorization, image semantic analysis, protein function detection, and social network analysis,
multiple outputs must be predicted simultaneously. In these problems, modelling the dependencies
between output labels improves the predictions. Many existing algorithms do not adequately
address multi-label classification with label dependencies and a large number of labels. In this
thesis, we investigate multi-label classification with dependencies between many labels, which
lets us efficiently solve multi-label learning problems with an intractably large number of
interdependent labels, such as the automatic tagging of Wikipedia pages.
We first study the nature of label dependencies and the efficiency of distributed multi-label
learning methods. We then propose an assumption-free label sampling approach to handle a huge
number of labels. Finally, we investigate and compare chain-ordered label dependency methods
and order-free learning methods for multi-label datasets.
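As a reading aid, a chain-ordered method can be written as a factorization of the joint label distribution along a fixed label order. The notation below is an assumption introduced for illustration (x is the input, y_1, ..., y_L are the labels, and \pi is a permutation fixing the chain order); the abstract does not state the exact formulation used in the thesis.

```latex
% Chain-ordered view: each label is predicted from the input and the labels
% that precede it in the (assumed) order \pi.
\[
  P(y_1, \dots, y_L \mid x)
  = \prod_{j=1}^{L} P\bigl(y_{\pi(j)} \mid x,\; y_{\pi(1)}, \dots, y_{\pi(j-1)}\bigr)
\]
```

The identity holds for any order \pi, but a model trained on one particular order generally depends on that choice. Order-free methods avoid committing to a single \pi.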
In the second part of our investigation of dependencies, we examine the complexities of
multimodal learning, since most learning tasks involve several sensory modalities, such as
vision and speech, which are our primary channels of communication and perception. We focus
on how to exploit modality dependencies in multimodal fusion, that is, in integrating
information from two or more modalities for better prediction.
Our aim is to understand and modulate the relative contribution of each modality in multimodal
inference tasks by investigating the dependencies between input modalities. Moreover, we
propose solutions to the curse of dimensionality that arises from high-order integration of
data from several sources. We make several contributions to multimodal data processing:
First, we investigate various basic fusion methods. In contrast to previous approaches, which
use simple linear combination or concatenation, we propose to generate an (M + 1)-way
high-order dependency structure (a tensor) that captures the high-order relationships between
the M modalities and the output layer of a neural network model. Applying a modality-based
tensor factorization method, which adopts different factors for different modalities, removes
information in one modality that can be compensated for by other modalities, with respect to
the model outputs. This modality-based tensor factorization also helps in understanding the
relative utility of the information in each modality and handles the scale issues of the
problem. In addition, it leads to a simpler model with fewer parameters and can therefore act
as a regularizer against overfitting.
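The parameter saving described above can be illustrated with a minimal NumPy sketch for M = 2 modalities. It is not the thesis implementation: the Tucker-style factorization, the sizes `d1`, `d2`, `k`, the per-mode ranks `r1`, `r2`, `rk`, and the function names are all assumptions chosen for illustration. It only shows that a 3-way weight tensor with a separate low-rank factor per modality gives the same fused output as the full tensor while storing far fewer parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: two modalities (M = 2) and one output layer.
d1, d2, k = 32, 16, 4   # feature sizes of modality 1, modality 2, and the output
r1, r2, rk = 4, 3, 2    # assumed ranks; each modality gets its own rank

# Tucker-style factors: one factor matrix per mode plus a small core tensor.
U1 = rng.standard_normal((d1, r1))     # factor for modality 1
U2 = rng.standard_normal((d2, r2))     # factor for modality 2
Uo = rng.standard_normal((k, rk))      # factor for the output mode
G = rng.standard_normal((r1, r2, rk))  # core tensor coupling the modes


def fuse_full(x1, x2, W):
    """Contract both modality vectors with a full (M + 1)-way weight tensor."""
    return np.einsum("i,j,ijk->k", x1, x2, W)


def fuse_factorized(x1, x2):
    """Compute the same map without building the full weight tensor."""
    z1 = U1.T @ x1                                 # project modality 1 to rank r1
    z2 = U2.T @ x2                                 # project modality 2 to rank r2
    core_out = np.einsum("a,b,abc->c", z1, z2, G)  # interact in the small core
    return Uo @ core_out                           # expand to the output size k


# Reference only: the full tensor implied by the factors.
W_full = np.einsum("abc,ia,jb,kc->ijk", G, U1, U2, Uo)

x1 = rng.standard_normal(d1)
x2 = rng.standard_normal(d2)
y_full = fuse_full(x1, x2, W_full)
y_fact = fuse_factorized(x1, x2)

full_params = d1 * d2 * k
fact_params = G.size + U1.size + U2.size + Uo.size
print(np.allclose(y_full, y_fact), full_params, fact_params)
```

With these assumed sizes, the full tensor holds 2048 weights and the factorized form holds 208. Every step is a plain tensor contraction, so the same computation could be written in an automatic-differentiation framework and trained end to end.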
According to our investigations and experimental results, including the dependencies in the
prediction tasks leads to simpler models with fewer parameters while improving the prediction
results. We aim to turn the challenge of the dimensionality of big data into an opportunity by
extracting the dependencies in the data and using them as extra information for solving the
prediction problems. We show that a divide-and-conquer approach based on label dependencies
yields a smaller but more accurate method than methods that ignore the dependencies. We then
show that a small subset of the labels can provide a lot of information about the remaining
labels, so a small subset suffices to perform the prediction tasks. Next, we compare
order-based dependency extraction with order-free methods and conclude that the order-free
methods are superior: they are more general and more accurate, especially on larger datasets.
We show that a high-order integration of the modalities captures more information about inter-
and intra-modality dependencies; however, it suffers from polynomial growth of the
dimensionality. We therefore propose a fully differentiable framework based on tensor
factorization that can be included in any neural-network-based learning method. In a nutshell,
our results demonstrate that the dependencies between multiple inputs or outputs can make the
problem simpler, smaller, and easier to train when the prediction tasks are combined with
dependency-based sampling, compression, or clustering methods.