THESIS
2001
viii, 57 leaves : ill. ; 30 cm
Abstract
Speech recognition is the enabling technology that allows humans to communicate with computers using their voices. While many speech recognizers work well in laboratory environments, their performance degrades significantly in real situations. As a result, robust recognition is an important topic. This thesis is about robust acoustic features for robust speech recognition.
In this thesis, we propose a set of novel robust acoustic features called the Auditory Spectrum Based Features (ASBF), which are based on the cochlear model of the human auditory system. The ASBF are a set of representative, or dominant, frequency values chosen to track the formants; the selection scheme is based on a second-order difference cochlear model (SDCM) and a primary auditory nerve processing model (PANPM). Improved recognition accuracy in noisy environments is possible with the ASBF because their signal-to-noise ratios are typically relatively large in the presence of noise. Our experimental results suggest that the ASBF are considerably more robust than the traditional MFCC features.