Tempo extraction using the discrete wavelet transform

HKUST Electronic Theses

by Tsang Kei Man

THESIS 2006

M.Phil. Computer Science and Engineering

xx, 102 leaves : ill. ; 30 cm

Abstract

This thesis presents a method for extracting the tempo from an audio file. We first study the beats in the audio; the interval between two successive beats is called the inter-onset interval (IOI). To investigate the IOI, two musicians were invited to carry out experiments on the inter-onset intervals of a data set consisting of 50 musical recordings extracted from audio CDs.

In our tempo extraction system, an audio file is read into memory and decomposed into four levels of coefficients by a discrete wavelet transform (DWT). A peak detection algorithm then extracts all peaks from these DWT coefficients, and the peaks are used to calculate the IOIs. Not all IOIs are equally important, so each IOI is assigned a weight in order to increase the accuracy of the system. The weight of an IOI is defined by how many of its neighbors give similar values. The weighted IOIs are accumulated into a histogram, which is then smoothed with a Gaussian function in order to better estimate the tempo.
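The weighting, histogram, and smoothing stages described above can be illustrated with a minimal Python sketch. It is not the thesis implementation (which was written in Matlab): it assumes peak times in seconds are already available from the DWT and peak detection stages, and the neighborhood radius, similarity tolerance, bin width, and Gaussian width are all hypothetical values chosen for illustration.

```python
import math

def inter_onset_intervals(peak_times):
    """Differences between successive peak times, in seconds."""
    return [b - a for a, b in zip(peak_times, peak_times[1:])]

def neighbor_weights(iois, radius=2, tolerance=0.05):
    """Weight each IOI by how many nearby IOIs have a similar value.

    `radius` (positions to look at on each side) and `tolerance`
    (relative difference counted as "similar") are assumed parameters.
    """
    weights = []
    for i, ioi in enumerate(iois):
        lo = max(0, i - radius)
        hi = min(len(iois), i + radius + 1)
        similar = sum(
            1 for j in range(lo, hi)
            if j != i and abs(iois[j] - ioi) <= tolerance * ioi
        )
        weights.append(1 + similar)  # every IOI counts at least once
    return weights

def weighted_histogram(iois, weights, bin_width=0.01, max_ioi=2.0):
    """Accumulate the weights into fixed-width IOI bins."""
    n_bins = int(round(max_ioi / bin_width))
    hist = [0.0] * n_bins
    for ioi, w in zip(iois, weights):
        idx = int(ioi / bin_width)
        if 0 <= idx < n_bins:
            hist[idx] += w
    return hist

def gaussian_smooth(hist, sigma_bins=2.0):
    """Convolve the histogram with a normalized Gaussian kernel."""
    half = int(3 * sigma_bins)
    kernel = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]
    smoothed = []
    for i in range(len(hist)):
        acc = 0.0
        for k, kv in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(hist):
                acc += kv * hist[j]
        smoothed.append(acc)
    return smoothed

def estimate_tempo(peak_times, bin_width=0.01):
    """Return a tempo in beats per minute from the strongest IOI bin."""
    iois = inter_onset_intervals(peak_times)
    weights = neighbor_weights(iois)
    smoothed = gaussian_smooth(weighted_histogram(iois, weights, bin_width))
    best = max(range(len(smoothed)), key=lambda i: smoothed[i])
    dominant_ioi = (best + 0.5) * bin_width  # center of the winning bin
    return 60.0 / dominant_ioi

# Peaks every 0.5 s plus one spurious peak: the estimate stays close to 120 BPM.
peaks = sorted([0.5 * k for k in range(20)] + [3.2])
print(round(estimate_tempo(peaks), 1))
```

In this sketch the spurious peak produces two isolated IOIs that receive the minimum weight, while the regular IOIs reinforce one another, which is the effect the weighting is meant to achieve.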

A stereo input is treated as three separate inputs: the left channel, the right channel, and a mono channel that is the average of the left and right channels. All three inputs are passed through the system, and the best result is selected as the final output.
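The three-input scheme can be sketched in Python as follows. The abstract does not state how the best result is chosen, so the selection rule here is an assumption: a hypothetical `analyze` function returns a tempo together with a confidence score, and the candidate with the highest score wins.

```python
def channel_candidates(left, right):
    """Build the three inputs: left, right, and their per-sample average."""
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    return {"left": list(left), "right": list(right), "mono": mono}

def select_best(candidates, analyze):
    """Run `analyze` on each candidate and keep the most confident result.

    `analyze(samples)` is a hypothetical callable returning
    (tempo_bpm, confidence); the confidence measure is an assumption,
    not something taken from the thesis.
    """
    results = {name: analyze(samples) for name, samples in candidates.items()}
    best_name = max(results, key=lambda name: results[name][1])
    return best_name, results[best_name][0]

# Usage with a stand-in analyzer that scores a channel by its total magnitude.
def toy_analyze(samples):
    return 120.0, sum(abs(x) for x in samples)

name, tempo = select_best(channel_candidates([1, 0, 1, 0], [0, 0, 0, 0]), toy_analyze)
print(name, tempo)
```

In a full system, `analyze` would be the tempo extraction pipeline itself, with the confidence derived from something like the height of the smoothed histogram peak.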

The entire system was implemented in Matlab. We tested it on our own data set of 50 musical recordings and on a data set that was used in a tempo extraction contest held during the International Conference on Music Information Retrieval (ISMIR 2004). The system obtained the correct tempo for 47 of the 50 songs in our data set. The contest provides two sets of data for testing; our system ranks 2nd out of 12 on one set and 3rd out of 12 on the other. These results show that our system is competitive with the other algorithms used in the contest.

Copyrighted to the author. Reproduction is prohibited without the author's prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: M.Phil.
Department: Computer Science and Engineering
Authors: Tsang, Kei Man
Subjects: Computer music; Music Acoustics and physics
Language: English
Call number: Thesis COMP 2006 Tsang
DOI: 10.14711/thesis-b931221
