
Global Source-Aware Statistical Post-Editing for General MT: Sentence Specification via Pseudo-Feedback

Abstract: Automatic post-editing (APE), which corrects translation errors, is an effective approach to improving the quality of machine translation (MT) output. This paper proposes a global source-aware statistical post-editing (SPE) model that improves MT quality by leveraging pseudo-feedback to achieve sentence specification. For a given source sentence, similar sentences are retrieved from a translation memory (TM) to serve as post-editing data. This data is a set of trilingual parallel texts, each containing a source sentence, its raw machine translation, and its gold reference (human translation). The alignments between the raw translations and the references are used to re-examine the effectiveness of the post-editing phrase pairs of a source-independent SPE model. The selected phrase pairs are then applied to polish the raw translation. Experimental results show that our method improves the original Google Translate output by 3.78 BLEU points, and outperforms a source-independent SPE model by 1.09 BLEU points and a local source-aware SPE model by 1.02 BLEU points.
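The pipeline summarized in the abstract (retrieve similar sentences, re-examine candidate phrase pairs against them, apply the surviving pairs) can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the paper's model: the translation memory contents, the candidate pairs, and the function names `retrieve`, `select_pairs`, and `post_edit` are all hypothetical; token-level `difflib` similarity stands in for whatever retrieval measure the paper uses; and plain substring matching stands in for the word alignments between raw translations and references.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: (source, raw MT, human reference) triples.
TM = [
    ("il aime jouer au football", "he likes play football", "he likes playing football"),
    ("elle aime jouer au football", "she likes play football", "she likes playing football"),
    ("il fait tres beau", "it is very good weather", "the weather is very nice"),
]

# Hypothetical candidate post-editing phrase pairs (MT phrase -> corrected phrase),
# standing in for the phrase table of a source-independent SPE model.
CANDIDATES = [("play football", "playing football"), ("very good", "very nice")]


def retrieve(source, tm, k=2):
    """Return the k TM triples whose source side is most similar to `source`."""
    def similarity(triple):
        # Token-level similarity ratio; an assumed stand-in for the real measure.
        return SequenceMatcher(None, source.split(), triple[0].split()).ratio()
    return sorted(tm, key=similarity, reverse=True)[:k]


def select_pairs(candidates, retrieved):
    """Keep pairs supported by the retrieved data: the MT phrase occurs in a
    retrieved raw translation and the corrected phrase occurs in its reference.
    Substring matching is a simplification of alignment-based checking."""
    kept = []
    for mt_phrase, fix in candidates:
        if any(mt_phrase in mt and fix in ref for _, mt, ref in retrieved):
            kept.append((mt_phrase, fix))
    return kept


def post_edit(raw_mt, pairs):
    """Apply each selected phrase pair to the raw translation."""
    for mt_phrase, fix in pairs:
        raw_mt = raw_mt.replace(mt_phrase, fix)
    return raw_mt


retrieved = retrieve("nous aimons jouer au football", TM)
selected = select_pairs(CANDIDATES, retrieved)
edited = post_edit("we like play football", selected)
```

In this toy run, only the two football sentences are retrieved, so the pair for "very good" finds no support and is filtered out, while the pair for "play football" is kept and applied. The point of the sketch is the source-aware filtering step: a phrase pair is used only when sentences similar to the current source provide evidence for it.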