THESIS
2012
xii, 85 p. : ill. ; 30 cm
Abstract
In this thesis, we propose using Adaboost with decision trees to implement music
emotion classification (MEC) using both audio and lyrics features, and further propose
using active learning to address the user subjectivity issue in MEC.
Previous work on music emotion recognition (MER) is mostly based on audio features,
which has revealed a glass ceiling, as there is a semantic gap between the low-level audio
features and the high-level user perception. Meanwhile, traditional text categorization
methods using bag-of-words features and machine learning methods such as SVM do not
perform well on MER from lyrics, because lyrics tend to be much shorter than other
documents. We propose to investigate (1) the best combination of audio features from
different areas, (2) a machine learning method that is suitable for MER from a few lines
of lyrics, and (3) combining both audio and lyrical features in the final MEC system.
We proposed a method of combining different audio feature sets (music audio-based,
psycho-acoustic-based, and speech emotion-based features) to classify songs into
emotion categories. We obtained accuracies of up to 91.4% on a four-category MEC task
and 66.8% on a ten-category MEC task on a dataset of over 5,000 songs, which are, to
our knowledge, the highest reported on a music database of this scale.
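As a rough illustration of this kind of feature combination, the different sets can be
concatenated into one vector per song before classification (early fusion). The extractor
names and dimensions below are hypothetical, not the thesis's actual feature sets:

```python
import numpy as np

# Hypothetical per-song feature matrices from three extractors; the
# dimensions are illustrative, not the thesis's actual feature sets
n_songs = 100
music_audio = np.random.rand(n_songs, 20)      # e.g. timbre/rhythm
psycho_acoustic = np.random.rand(n_songs, 12)  # e.g. loudness, roughness
speech_emotion = np.random.rand(n_songs, 8)    # e.g. prosodic features

# Early fusion: concatenate the sets into one feature vector per song
X = np.hstack([music_audio, psycho_acoustic, speech_emotion])
print(X.shape)  # (100, 40)
```

The fused matrix can then be passed to any standard classifier.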
We then proposed using Adaboost, an ensemble-based machine learning method, for
MER from lyrics. Adaboost combines many weak classifiers that model the presence or
absence of salient phrases to make the correct classification. This approach is especially
suitable for classifying short texts such as lyrics. Our accuracy reached an average of
74.12% for classifying 3,766 unique songs into 14 emotion categories, compared to an
average accuracy of 70.30% using the best-known method, a difference that is
statistically significant at the 99.99% confidence level.
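A minimal sketch of this idea, assuming scikit-learn's AdaBoost implementation (whose
default weak learner is a depth-1 decision stump) over bag-of-words features; the toy
lyric snippets and labels are invented for illustration, not the thesis dataset:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Toy lyric snippets and emotion labels (illustrative only)
lyrics = [
    "tears fall down my lonely heart",
    "dance all night feel the beat",
    "cry alone in the cold rain",
    "party lights and happy friends",
]
labels = ["sad", "happy", "sad", "happy"]

# Bag-of-words features over the short texts
vec = CountVectorizer()
X = vec.fit_transform(lyrics)

# AdaBoost over decision stumps: each weak learner effectively tests
# the presence or absence of a single salient word
clf = AdaBoostClassifier(n_estimators=50)
clf.fit(X, labels)
print(clf.predict(vec.transform(["lonely tears in the rain"]))[0])
```

Because each stump keys on one token, the ensemble amounts to a weighted vote over
salient words, which copes well with the sparse features of very short texts.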
Furthermore, we applied Adaboost in a multi-modal MEC system by combining audio
and lyrical features, and achieved an average accuracy of 78.19% on the same dataset,
a statistically significant improvement over the baseline at the 99.99% confidence level.
Finally, we introduced active learning in a personalized music emotion classification
system to address the subjectivity issue for a more user-friendly MER. The only published
personalized MER system that deals with subjectivity requires high user participation,
which is impractical in reality. Our proposed algorithm reduced user annotation effort
by about 80% without decreasing system performance.
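The general active-learning loop can be sketched as uncertainty sampling: the system
repeatedly asks the user to annotate only the song it is least sure about. The classifier,
features, and data below are stand-ins (synthetic one-dimensional data with a logistic
regression), not the thesis's actual algorithm:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic pool of 200 "songs" with one feature and two emotion classes
X_pool = np.linspace(-3, 3, 200).reshape(-1, 1)
y_pool = (X_pool[:, 0] > 0).astype(int)

# Seed the system with a few labeled songs from both classes
labeled = [0, 25, 50, 75, 100, 125, 150, 175]
unlabeled = [i for i in range(200) if i not in labeled]

clf = LogisticRegression()
for _ in range(10):  # each round asks the user to annotate ONE song
    clf.fit(X_pool[labeled], y_pool[labeled])
    # Uncertainty sampling: query the song whose predicted probability
    # is closest to 0.5, i.e. the one the model is least sure about
    probs = clf.predict_proba(X_pool[unlabeled])[:, 1]
    pick = unlabeled[int(np.argmin(np.abs(probs - 0.5)))]
    unlabeled.remove(pick)
    labeled.append(pick)  # simulate the user providing this label

print(round(clf.score(X_pool, y_pool), 2))
```

Because each query targets the decision boundary, far fewer user annotations are needed
than labeling the whole pool, which is the intuition behind the roughly 80% reduction in
annotation effort reported above.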