THESIS
2012
xii, 85 p. : ill. ; 30 cm
Abstract
In this thesis, we propose using Adaboost with decision trees to implement music
emotion classification (MEC) using both audio and lyrics features, and further propose
using active learning to address the user subjectivity issue in MEC.
Previous work on music emotion recognition (MER) is mostly based on audio features,
which has revealed a glass ceiling, as there is a semantic gap between the low-level audio
features and the high-level user perception. Meanwhile, traditional text categorization
methods using bag-of-words features and machine learning methods such as SVM do not
perform well on MER from lyrics, because lyrics tend to be much shorter than other
documents. We propose to investigate (1) the best combination of audio features from
different areas, (2) a machine learning method that is suitable for MER from a few lines
of lyrics, and (3) combining both audio and lyrical features in the final MEC system.
We proposed a method of combining different audio feature sets (music audio-based,
psycho-acoustic-based, and speech emotion-based features) to classify songs into
emotion categories. We obtained accuracies of up to 91.4% on a four-category MEC task
and 66.8% on a ten-category MEC task on a dataset of over 5,000 songs, which are, to
our knowledge, the highest reported on a music database of this scale.
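As a rough illustration of this kind of feature combination, the different sets can be
concatenated into one vector per song before classification (early fusion). The extractor
names and dimensions below are hypothetical, not the thesis's actual feature sets:

```python
import numpy as np

# Hypothetical per-song feature matrices from three extractors; the
# dimensions are illustrative, not the thesis's actual feature sets
n_songs = 100
music_audio = np.random.rand(n_songs, 20)      # e.g. timbre/rhythm
psycho_acoustic = np.random.rand(n_songs, 12)  # e.g. loudness, roughness
speech_emotion = np.random.rand(n_songs, 8)    # e.g. prosodic features

# Early fusion: concatenate the sets into one feature vector per song
X = np.hstack([music_audio, psycho_acoustic, speech_emotion])
print(X.shape)  # (100, 40)
```

The fused matrix can then be passed to any standard classifier.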
We then proposed using Adaboost, an ensemble-based machine learning method, for
MER from lyrics. Adaboost combines many weak classifiers that model the presence or
absence of salient phrases to make the correct classification. This approach is especially
suitable for classifying short texts such as lyrics. Our accuracy reached an average of
74.12% for classifying 3,766 unique songs into 14 emotion categories, compared to an
average accuracy of 70.30% using the best-known method, a difference that is
statistically significant at the 99.99% confidence level.
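A minimal sketch of this idea, assuming scikit-learn's AdaBoost implementation (whose
default weak learner is a depth-1 decision stump) over bag-of-words features; the toy
lyric snippets and labels are invented for illustration, not the thesis dataset:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Toy lyric snippets and emotion labels (illustrative only)
lyrics = [
    "tears fall down my lonely heart",
    "dance all night feel the beat",
    "cry alone in the cold rain",
    "party lights and happy friends",
]
labels = ["sad", "happy", "sad", "happy"]

# Bag-of-words features over the short texts
vec = CountVectorizer()
X = vec.fit_transform(lyrics)

# AdaBoost over decision stumps: each weak learner effectively tests
# the presence or absence of a single salient word
clf = AdaBoostClassifier(n_estimators=50)
clf.fit(X, labels)
print(clf.predict(vec.transform(["lonely tears in the rain"]))[0])
```

Because each stump keys on one token, the ensemble amounts to a weighted vote over
salient words, which copes well with the sparse features of very short texts.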
Furthermore, we applied Adaboost in a multi-modal MEC system by combining audio
and lyrical features, and achieved an average accuracy of 78.19% on the same dataset,
a statistically significant improvement over the baseline at the 99.99% confidence level.
Finally, we introduced active learning in a personalized music emotion classification
system to address the subjectivity issue for a more user-friendly MER. The only published
personalized MER system that deals with subjectivity requires high user participation,
which is impractical in reality. Our proposed algorithm reduced user annotation effort
by about 80% without decreasing system performance.
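The general active-learning loop can be sketched as uncertainty sampling: the system
repeatedly asks the user to annotate only the song it is least sure about. The classifier,
features, and data below are stand-ins (synthetic one-dimensional data with a logistic
regression), not the thesis's actual algorithm:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic pool of 200 "songs" with one feature and two emotion classes
X_pool = np.linspace(-3, 3, 200).reshape(-1, 1)
y_pool = (X_pool[:, 0] > 0).astype(int)

# Seed the system with a few labeled songs from both classes
labeled = [0, 25, 50, 75, 100, 125, 150, 175]
unlabeled = [i for i in range(200) if i not in labeled]

clf = LogisticRegression()
for _ in range(10):  # each round asks the user to annotate ONE song
    clf.fit(X_pool[labeled], y_pool[labeled])
    # Uncertainty sampling: query the song whose predicted probability
    # is closest to 0.5, i.e. the one the model is least sure about
    probs = clf.predict_proba(X_pool[unlabeled])[:, 1]
    pick = unlabeled[int(np.argmin(np.abs(probs - 0.5)))]
    unlabeled.remove(pick)
    labeled.append(pick)  # simulate the user providing this label

print(round(clf.score(X_pool, y_pool), 2))
```

Because each query targets the decision boundary, far fewer user annotations are needed
than labeling the whole pool, which is the intuition behind the roughly 80% reduction in
annotation effort reported above.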