In this paper we describe the use of QuEst, a framework that aims to obtain predictions on the quality of translations, to improve the performance of machine translation (MT) systems without changing their internal functioning. We apply QuEst to experiments with:
i. multiple system translation ranking, where translations produced by different MT systems are ranked according to their estimated quality, leading to gains of up to 2.72 BLEU, 3.66 BLEUs, and 2.17 F1 points;
ii. n-best list re-ranking, where the translations in an n-best list produced by an MT system are re-ranked by predicted quality score so that the best translation is ranked first, which improves the sentence NIST score by 0.41 points;
iii. n-best list combination, where segments from an n-best list are combined using a lattice-based re-scoring approach that minimizes word error, obtaining gains of 0.28 BLEU points; and
iv. the ITERPE strategy, which attempts to identify translation errors regardless of prediction errors (ITERPE) and builds sentence-specific SMT systems (SSSS) on the ITERPE-sorted instances identified as having more potential for improvement, achieving gains of up to 1.43 BLEU, 0.54 F1, 2.9 NIST, 0.64 sentence BLEU, and 4.7 sentence NIST points in English-to-German over the top 100 ITERPE-sorted instances.
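The ranking idea shared by experiments (i) and (ii) can be illustrated with a minimal sketch. It assumes that a quality score has already been predicted for every candidate translation and that higher scores mean better quality. The function names `rerank_by_predicted_quality` and `select_best` are hypothetical and are not part of QuEst; the quality predictor itself is outside the scope of the sketch.

```python
from typing import List, Sequence, Tuple


def rerank_by_predicted_quality(
    candidates: Sequence[str],
    predicted_quality: Sequence[float],
) -> List[Tuple[str, float]]:
    """Return (candidate, score) pairs sorted best-first.

    Assumption: higher predicted score means better translation.
    The scores are taken as given; no quality model is trained here.
    """
    if len(candidates) != len(predicted_quality):
        raise ValueError("each candidate needs exactly one predicted score")
    paired = list(zip(candidates, predicted_quality))
    # sorted() is stable, so candidates with equal scores keep their
    # original order (e.g. the MT system's own n-best order).
    return sorted(paired, key=lambda pair: pair[1], reverse=True)


def select_best(
    candidates: Sequence[str],
    predicted_quality: Sequence[float],
) -> str:
    """Pick the single candidate with the highest predicted quality."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    return rerank_by_predicted_quality(candidates, predicted_quality)[0][0]


if __name__ == "__main__":
    # Toy n-best list with made-up scores, for illustration only.
    nbest = ["the house is small", "the home is little", "house small is the"]
    scores = [0.61, 0.74, 0.12]
    for text, score in rerank_by_predicted_quality(nbest, scores):
        print(f"{score:.2f}\t{text}")
```

The same routine covers both settings: for multiple system ranking the candidates are the outputs of different MT systems for one source sentence, and for n-best re-ranking they are the entries of a single system's n-best list.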