THESIS
2018
xviii, 159 pages : illustrations (some color) ; 30 cm
Abstract
We show that the quality of machine translation (MT) output across different genres (written
news, public speech, etc.), output languages (Chinese and English), MT paradigms (phrase-based
and hierarchical), and optimization strategies is consistently improved by guiding the MT system
to preserve the meaning of the input sentence with our novel semantic-frame-based objective
function, MEANT, which reflects translation utility better than commonly used surface-form-based
objective functions such as BLEU.
Current MT systems are often able to output fluent, nearly grammatically correct translations
with roughly the correct words, yet they still make glaring errors by confusing semantic roles
and failing to express the original meaning of the input. A useful translation is one that helps
its reader accurately understand the original meaning of the input utterance. Over the past
decade, however, the development of MT systems has been driven by BLEU and other fast, cheap
evaluation metrics based on n-gram surface-form matching, which fail to reflect translation
utility, i.e., the ability of the human reader to accurately understand the meaning of the input
utterance. Even when human judgment clearly indicates that a translation makes serious mistakes
in conveying the meaning of the input utterance, such metrics typically register little
difference. Frame semantics capture the essential meaning of a sentence in its basic event
structure: "who did what to whom, for whom, when, where, why and how". As the performance of
MT systems has plateaued, we argue that it is time for a new semantic-frame-based MT evaluation
metric that reflects how correctly a translation conveys the meaning of the input, in order to
drive MT systems to produce more adequate and useful translations.
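To make the frame-based view concrete, below is a minimal, illustrative sketch in Python, not the actual MEANT implementation described in this thesis: each sentence is represented as a set of semantic frames (a predicate plus its role fillers), and a translation is scored by the F-measure of role fillers that match the reference. The exact-match alignment, the role labels, and the example sentences are simplifying assumptions; semantic role labeling is assumed to have been done upstream, and the real metric aligns frames and fillers softly rather than by exact string match.

    from dataclasses import dataclass, field

    @dataclass
    class Frame:
        """A predicate and its role fillers, e.g. {"ARG0": "the police", "ARG1": "the suspect"}."""
        predicate: str
        roles: dict = field(default_factory=dict)

    def frame_f_score(hypothesis, reference):
        """Toy frame-level F-measure using exact match of predicates and role fillers."""
        matched = 0
        hyp_total = sum(len(f.roles) for f in hypothesis)
        ref_total = sum(len(f.roles) for f in reference)
        for ref_frame in reference:
            # Naive alignment: take the first hypothesis frame with the same predicate.
            for hyp_frame in hypothesis:
                if hyp_frame.predicate == ref_frame.predicate:
                    matched += sum(1 for role, filler in ref_frame.roles.items()
                                   if hyp_frame.roles.get(role) == filler)
                    break
        if matched == 0:
            return 0.0
        precision = matched / hyp_total
        recall = matched / ref_total
        return 2 * precision * recall / (precision + recall)

    # A translation that swaps "who did what to whom" scores 0.0 even though it
    # contains roughly the right words; a role-preserving translation scores 1.0.
    ref = [Frame("arrest", {"ARG0": "the police", "ARG1": "the suspect"})]
    good = [Frame("arrest", {"ARG0": "the police", "ARG1": "the suspect"})]
    bad = [Frame("arrest", {"ARG0": "the suspect", "ARG1": "the police"})]
    print(frame_f_score(good, ref))  # 1.0
    print(frame_f_score(bad, ref))   # 0.0

The point of the toy example is exactly the failure mode described above: an n-gram metric sees nearly identical word overlap for the two hypotheses, while a frame-based score separates them sharply.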
In this thesis, we first introduce HMEANT, a human-involved, semi-automatic semantic-frame-based
MT evaluation metric that correlates with human judgments of translation adequacy better than
not only automatic MT evaluation metrics but also HTER, the state-of-the-art semi-automatic
adequacy-oriented MT evaluation metric, and does so at a lower labor cost.
We go on to fully automate HMEANT into MEANT and show that MEANT correlates with human adequacy
judgments as well as or better than the state-of-the-art automatic MT evaluation metrics when
scoring MT output against human reference translations for a wide range of output languages
(Czech, English, German, French, Hindi, Romanian, and Russian), while requiring fewer
language-dependent resources and producing more interpretable scores. We then present XMEANT,
a cross-lingual variant that approximates MEANT by scoring the MT output directly against the
input sentence when costly human reference translations are not available for MT evaluation.
Most importantly, we empirically demonstrate that MT systems optimized against MEANT produce
better translations in terms of the most commonly used automatic MT evaluation metrics across
different genres, language pairs, MT paradigms, and optimization strategies.
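As a hedged illustration of what optimizing an MT system against a metric like MEANT can look like, the sketch below tunes decoder feature weights by reranking k-best candidate lists on a development set so as to maximize a pluggable corpus-level metric. The grid search, data layout, and function names are assumptions made for illustration; they stand in for the MERT/PRO-style optimizers used with real systems and are not the thesis's actual experimental setup.

    from itertools import product

    def rerank(kbest, weights):
        """Pick the candidate whose weighted decoder-feature sum is highest."""
        return max(kbest, key=lambda c: sum(w * f for w, f in zip(weights, c["features"])))

    def tune(dev_kbest_lists, references, metric, weight_grid):
        """Return the feature weights whose 1-best selections score highest under `metric`.

        weight_grid holds one list of candidate values per decoder feature;
        metric(hypotheses, references) returns a single corpus-level score,
        e.g. an average sentence-level semantic-frame score instead of BLEU.
        """
        best_weights, best_score = None, float("-inf")
        for weights in product(*weight_grid):
            selected = [rerank(kbest, weights)["text"] for kbest in dev_kbest_lists]
            score = metric(selected, references)
            if score > best_score:
                best_weights, best_score = weights, score
        return best_weights

Swapping the metric argument is the only change needed to move from surface-form tuning to meaning-oriented tuning in this simplified setup, which is the sense in which the thesis speaks of driving MT systems with a semantic objective.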