THESIS
2010
xi, 58 p. : ill. ; 30 cm
Abstract
Greenberg reported in 1998 that the phone deletion rate in conversational speech may be as high as 12%. Jurafsky, in turn, reported in 2001 that phone deletions cannot be modeled well by traditional triphone training. These findings motivate us to model phone deletions explicitly in current automatic speech recognition (ASR) systems. In this thesis, phone deletions are modeled by adding skip arcs to the word models. To cope with the limitations of whole-word models, context-dependent fragmented word models (CD-FWMs) are proposed. The proposed method is evaluated on both a read speech task (Wall Street Journal) and a conversational speech task (SVitchboard). In the read speech evaluation, we obtained a word error rate reduction of about 11%. Although the improvement on conversational speech is modest, the reasons are discussed and relevant analyses are carried out.