THESIS
2015
xi, 110 pages : illustrations ; 30 cm
Abstract
Multilingual people often code-switch, mixing two languages within the same sentence
(intra-sentential code-switching) or between sentences (inter-sentential code-switching).
In this thesis, we address the dual challenge of code-switching speech recognition
in terms of both acoustic modeling and language modeling.
The acoustic modeling challenge is due to the lack of labeled code-switching data.
We propose a novel asymmetric pronunciation and acoustic modeling approach that uses
a single set of models trained on a small amount of accented data in the second
language and monolingual data in the main language. We tested our proposed asymmetric
acoustic models on inter-sentential and intra-sentential code-switching test
sets and showed that our approach significantly outperforms previous approaches that
use a limited amount of code-switched data or adaptation.
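
The following Python sketch illustrates the asymmetric idea just described. It is a
minimal illustration, not the thesis implementation: the phone mapping, phone names,
and example words are invented for demonstration.

    # Illustrative sketch only (not the thesis code): build an asymmetric
    # pronunciation lexicon in which second-language words are transcribed with
    # the main language's phone inventory, so a single set of acoustic models
    # can cover both languages. The phone map and example words are invented.

    PHONE_MAP = {
        # second-language phone -> closest main-language phone (assumed mapping)
        "ae": "a", "ih": "i", "uw": "u", "th": "s", "er": "e",
    }

    def map_pronunciation(phones):
        """Map second-language phones to main-language phones (unknowns pass through)."""
        return [PHONE_MAP.get(p, p) for p in phones]

    def build_asymmetric_lexicon(main_lexicon, second_lexicon):
        """Merge two lexicons; second-language entries are remapped to main-language phones."""
        lexicon = dict(main_lexicon)                    # main-language entries kept as-is
        for word, phones in second_lexicon.items():
            lexicon[word] = map_pronunciation(phones)   # second-language entries remapped
        return lexicon

    if __name__ == "__main__":
        main_lex = {"wordA": ["a", "o"], "wordB": ["i", "u"]}
        second_lex = {"email": ["ih", "m", "ey", "l"]}
        print(build_asymmetric_lexicon(main_lex, second_lex))
        # {'wordA': ['a', 'o'], 'wordB': ['i', 'u'], 'email': ['i', 'm', 'ey', 'l']}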
The challenge for language modeling is predicting the code-switching point. It
is generally accepted by linguists that code switching follows the Inversion
Transduction Grammar Constraint, under which switching does not violate the grammar
of either language. Under another constraint, the Functional Head Constraint, code
switching is forbidden between a functional head and its complements. However,
neither of these linguistic constraints has previously been modeled computationally or
incorporated into code-switching speech recognition.
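
The following Python sketch makes the Functional Head Constraint concrete. The data
structures and example are hypothetical, not drawn from the thesis: given per-token
language tags and head-complement arcs from a parse, a candidate switch point is
flagged whenever a functional head and its complement carry different language tags.

    # Illustrative sketch of a Functional Head Constraint check (hypothetical
    # data structures, not the thesis implementation): a switch point is
    # disallowed when a functional head and its complement differ in language.

    from dataclasses import dataclass

    @dataclass
    class Arc:
        head: int          # index of the head token
        dependent: int     # index of its complement
        functional: bool   # True if the head is a functional head (e.g. determiner, auxiliary)

    def fhc_violations(lang_tags, arcs):
        """Return (head, complement) pairs that break the Functional Head Constraint."""
        return [(a.head, a.dependent)
                for a in arcs
                if a.functional and lang_tags[a.head] != lang_tags[a.dependent]]

    # Token 0 is a functional head of token 1, so switching between them is flagged.
    tags = ["L1", "L2", "L1"]
    arcs = [Arc(head=0, dependent=1, functional=True),
            Arc(head=1, dependent=2, functional=False)]
    print(fhc_violations(tags, arcs))   # -> [(0, 1)]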
We propose the first computational approach to incorporating these linguistic
constraints into a statistical language model for speech recognition. We propose
a weighted finite-state transducer (WFST) framework so that linguistic constraints
such as the Inversion Transduction Grammar Constraint and the Functional Head
Constraint can be incorporated. We propose the first statistical code-switching
language model that integrates the syntactic Inversion Transduction Grammar
Constraint through a chunk segmentation model and a chunk translation model. We also
propose a constrained code-switching language model with the Functional Head
Constraint, obtained by first expanding the search network with a translation model
and then restricting the paths to those permitted by parsing. Experimental results
on lecture speech and lunch conversation datasets show that our systems reduce word
error rates compared to previous approaches. Our proposed approaches delay
code-switching boundary decisions to avoid propagated errors. We also address the
code-switching data scarcity challenge by exploiting bilingual data through
language borrowing.
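
The following Python sketch gives a rough, brute-force illustration of the
expand-then-restrict idea described above. The helper functions, translation table,
constraint, and scoring function are hypothetical stand-ins; the thesis uses a
WFST-based system rather than this enumeration.

    # Rough sketch of the "expand, then restrict" strategy (hypothetical helpers;
    # the thesis uses a WFST framework rather than this brute-force enumeration):
    # each word is expanded with its translation alternatives, and only complete
    # paths that satisfy the constraint are scored, so switch-point decisions are
    # delayed until whole hypotheses are available.

    from itertools import product

    def expand(words, translation_table):
        """Yield full hypotheses: each word is kept or replaced by a translation alternative."""
        options = [[w] + translation_table.get(w, []) for w in words]
        for path in product(*options):
            yield list(path)

    def decode(words, translation_table, is_permissible, lm_score):
        """Keep only constraint-satisfying paths and return the best-scoring one."""
        candidates = [p for p in expand(words, translation_table) if is_permissible(p)]
        return max(candidates, key=lm_score) if candidates else list(words)

    # Toy usage with stand-in constraint and language-model scoring functions.
    table = {"meeting": ["MEETING_L2"]}
    best = decode(["the", "meeting", "tomorrow"], table,
                  is_permissible=lambda path: True,   # e.g. a Functional Head Constraint check
                  lm_score=lambda path: -len(path))   # placeholder for a real LM score
    print(best)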