Translation memories (TMs), as part of Computer Assisted Translation (CAT) tools, help translators reuse portions of previously translated text. Fencing books are good candidates for TMs because of their high number of repeated terms. Medieval texts, however, have several drawbacks that make even a "simple" rewording into the modern form of the same language difficult. The difficulties analyzed here are the lack of systematic spelling, unusual word order, and errors in the original. We state and verify the hypothesis that even a simple modernization is feasible and increases legibility, and that applying translation memories is worthwhile because the repeated terms are numerous and sometimes extremely long. Methods and algorithms are therefore presented (1) for the automated transcription of medieval texts when only a limited training set is available, and (2) for the collection of repeated patterns. The algorithms are evaluated in terms of recall and precision.
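As a rough illustration of what collecting repeated patterns and scoring the result by precision and recall can look like, the following minimal sketch counts repeated word n-grams in a token sequence. It is not the algorithm of the paper: the function names, the length and frequency thresholds, the sample sentence, and the gold set are all assumptions made up for this example.

```python
from collections import Counter


def repeated_ngrams(tokens, min_len=2, max_len=5, min_count=2):
    """Return the word n-grams that occur at least min_count times.

    Hypothetical helper: brute-force counting over all n-gram lengths,
    not the pattern-collection method described in the paper.
    """
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


def precision_recall(predicted, gold):
    """Precision and recall of a predicted set against a gold set."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Invented sample sentence (not taken from the manuscript).
text = "mit dem messer haw mit dem messer stich mit dem schwert"
found = repeated_ngrams(text.split())

# Invented gold set of patterns a human annotator might mark.
gold = {("mit", "dem", "messer"), ("dem", "schwert")}
precision, recall = precision_recall(found, gold)
```

Counting every n-gram is quadratic in the worst case, so for a full manuscript a frequent-pattern mining approach would likely be preferable to this brute-force version.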