THESIS
2014
Abstract
In this thesis, we aim to classify "live" and "studio" versions of a song using audio features. We solve this problem using supervised machine learning techniques, and then address the issue of data scarcity with a co-training algorithm in a semi-supervised setting. This issue has rarely been addressed before, yet it is of paramount importance. Indeed, many online music collections, such as YouTube videos, are user-generated and therefore vary widely in quality, which adversely affects the listening experience of users of online streaming services. In this work, we are particularly interested in knowing whether the song a listener is about to play is the original studio version or a secondary live recording.
As manual labeling can be tedious and challenging in practice, we first propose to automatically classify a music data set using machine learning techniques in a supervised setting, relying only on the audio content of the songs. We show which segments of the songs are most relevant for distinguishing between "live" and "studio" versions and discuss the relative importance of audio, acoustic and music features for this classification task. We then build a more robust system through ensemble learning: exploiting the diversity of several different classifiers, we apply stacked generalization to our classification task and achieve up to 92.82% global accuracy on a 1066-song data set.
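The abstract does not detail the stacking configuration, so the following is only a minimal sketch of stacked generalization for this task using scikit-learn. The base learners, meta-learner, feature dimensionality and the placeholder feature matrix are all assumptions standing in for the per-song audio features described in the thesis.

```python
# Minimal sketch of stacked generalization for live/studio classification.
# Feature extraction and classifier choices are assumptions, not the
# thesis' exact configuration.
import numpy as np
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# X: one audio feature vector per song (e.g. spectral or MFCC statistics);
# y: 1 = live, 0 = studio. Random placeholder data here.
rng = np.random.default_rng(0)
X = rng.normal(size=(1066, 40))
y = rng.integers(0, 2, size=1066)

# Diverse base classifiers; a meta-learner combines their predictions.
stack = StackingClassifier(
    estimators=[
        ("svm", SVC(probability=True)),
        ("rf", RandomForestClassifier(n_estimators=200)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)

print(cross_val_score(stack, X, y, cv=5).mean())
```

In this sketch the meta-learner is fit on out-of-fold predictions of the base classifiers (cv=5), which is what lets stacking exploit their diversity without overfitting to the training labels.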
Finally, we tackle this classification problem in a semi-supervised setting. Specifically, we are interested in cases where very little annotated training data is available, and we demonstrate how an original co-training algorithm can alleviate the problem of data scarcity by exploiting a large unlabeled data set. This method is shown to give significantly better results (a 10%-absolute accuracy improvement when only 15 examples are initially annotated) than classifiers trained only on the initially annotated data.
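For illustration, here is a minimal co-training loop in the classic two-view style. The split into two feature views, the logistic-regression base learners, the confidence-based selection and the number of pseudo-labels added per round are assumptions, not necessarily the algorithm proposed in the thesis.

```python
# Minimal co-training sketch (two-view style): two classifiers trained on
# different feature views iteratively pseudo-label confident unlabeled songs
# into the shared labeled pool. All concrete choices here are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression


def co_train(X1, X2, y, labeled_idx, unlabeled_idx, rounds=10, per_round=5):
    """X1, X2: two feature views of the same songs; y: labels, valid only
    at labeled_idx. Returns the two co-trained classifiers."""
    labeled, unlabeled = list(labeled_idx), list(unlabeled_idx)
    y = np.asarray(y).copy()            # pseudo-labels go into a copy
    clf1 = LogisticRegression(max_iter=1000)
    clf2 = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        clf1.fit(X1[labeled], y[labeled])
        clf2.fit(X2[labeled], y[labeled])
        if not unlabeled:
            break
        for clf, X in ((clf1, X1), (clf2, X2)):
            # Each view labels its most confident unlabeled songs.
            proba = clf.predict_proba(X[unlabeled])
            picks = np.argsort(proba.max(axis=1))[-per_round:]
            for p in sorted(picks, reverse=True):
                idx = unlabeled[p]
                y[idx] = clf.classes_[proba[p].argmax()]
                labeled.append(idx)
                del unlabeled[p]
            if not unlabeled:
                break
    return clf1, clf2
```

Starting from a handful of annotated songs, each round enlarges the pool both classifiers are retrained on with confident pseudo-labels, which is how a large unlabeled set can compensate for scarce annotations.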