THESIS
2014
xiv, 76 pages : illustrations ; 30 cm
Abstract
Cantonese is a widely spoken language/dialect, well known for its rich set of nine tones and for the similarity in contour between those tones, which makes automated Cantonese tone recognition very challenging. The Hilbert-Huang Transform (HHT) is an empirical algorithm that works on non-stationary and nonlinear signals. This study examined the performance of the HHT algorithm in recognizing Cantonese tones in isolated syllables.
In the first stage of the study, HHT was used as a frequency detection tool for syllables from the CUSYL corpus. The experimental results showed a 25% improvement in the accuracy of fundamental frequency detection compared with peak picking on the Fast Fourier Transform (FFT). In the second stage, the accuracy of HHT on the CUSYL corpus was improved by experimenting with various parameters of its core component, the Windowed Average-based Empirical Mode Decomposition (WA-based EMD). In the final stage, Support Vector Machines (SVMs) were used as binary classification tools. Pitch track information obtained by HHT, together with tone information from the CUSYL corpus, was used to train a set of six SVMs on more than 1,500 syllables. The experimental results showed a 79.08% speaker-independent tone recognition rate for isolated Cantonese syllables.
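For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of spectral peak picking for fundamental frequency estimation. It is not the thesis's implementation: the function name `dft_peak_f0`, the 70 to 400 Hz search band, the Hann window, and the synthetic test tone are all assumptions made for illustration. For brevity it evaluates the discrete Fourier transform directly at the bins inside the search band, which yields the same values an FFT would produce at those bins.

```python
import cmath
import math


def dft_peak_f0(samples, sample_rate, fmin=70.0, fmax=400.0):
    """Return the frequency (Hz) of the largest spectral peak in [fmin, fmax].

    Hypothetical helper for illustration; band limits are assumed values.
    """
    n = len(samples)
    # Apply a Hann window to reduce spectral leakage.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    # DFT bin k corresponds to frequency k * sample_rate / n.
    k_lo = math.ceil(fmin * n / sample_rate)
    k_hi = math.floor(fmax * n / sample_rate)
    best_k, best_mag = k_lo, -1.0
    for k in range(k_lo, k_hi + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate / n


# Synthetic stand-in for a voiced frame: 200 Hz fundamental plus a weaker
# second harmonic (assumed test signal, not CUSYL data).
sample_rate = 8000
frame = [
    math.sin(2.0 * math.pi * 200.0 * i / sample_rate)
    + 0.3 * math.sin(2.0 * math.pi * 400.0 * i / sample_rate)
    for i in range(800)
]
estimate = dft_peak_f0(frame, sample_rate)
```

A single frame like this yields one pitch value, with a resolution limited by the bin spacing (`sample_rate / n`, here 10 Hz). Repeating the estimate over successive frames would give a pitch track, which is the kind of fixed-resolution contour that a method suited to non-stationary signals, such as HHT, aims to improve on.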