Rule-Based Machine Translation for the Italian–Sardinian Language Pair

Open access

Abstract

This paper describes the creation of the first machine translation system from Italian to Sardinian, a Romance language spoken on the island of Sardinia in the Mediterranean. The project was carried out by a team of translators and computational linguists. The article focuses on the technology used (rule-based machine translation), on some of the rules created, and on the orthographic model adopted for Sardinian.

The Prague Bulletin of Mathematical Linguistics

The Journal of Charles University
