THESIS
1995
xi, 108 leaves : ill. ; 30 cm
Abstract
We devise and compare two heuristic search strategies to guide the search for the most probable translation given an input sentence and a precomputecl set of statistical parameters. The statistical learning approach to machine translation has recently been made possible by the growing availability of bilingual corpora and the increasing power of computers, and preliminary successes have been shown for French-English translation. Yet the large number of candidate translation hypotheses makes simple search algorithms infeasibly slow; the task of finding a translation string E that maximizes the value Pr(E)Pr(C/E) given an input string C involves an exponential search space. Therefore a systematic and efficient suboptimal approach must be employed in searching for a target translation of...[
Read more ]
We devise and compare two heuristic search strategies to guide the search for the most probable translation given an input sentence and a precomputecl set of statistical parameters. The statistical learning approach to machine translation has recently been made possible by the growing availability of bilingual corpora and the increasing power of computers, and preliminary successes have been shown for French-English translation. Yet the large number of candidate translation hypotheses makes simple search algorithms infeasibly slow; the task of finding a translation string E that maximizes the value Pr(E)Pr(C/E) given an input string C involves an exponential search space. Therefore a systematic and efficient suboptimal approach must be employed in searching for a target translation of an input string.
Our first heuristic search strategy employs a pure beam search on the translation hypothesis space, with no additional pruning or linguistic knowledge. AIthough reasonable translation accuracies were obtained, we found the overall speed to be relatively slow, and we also observed cases where accuracy could clearly be improved. The search was sometimes led to explore implausible subspaces instead of pruning them, because the baseline translation model we have been using ignores any grouping structure of words in the input string. In our second heuristic search strategy, we introduce a hybrid approxh of augmenting the input string with a simple bracketing to guide the search. This results in improved performance in terms of the efficiency and the accuracy of the translated ooutput.
Post a Comment