Genetic analysis of cabbages and related cultivated plants using the bag-of-words model

In this study, we introduce the bag-of-words model, an analytical method used mainly in natural language analysis (document classification, authorship attribution and so on; e.g. [1, 2]). Quantitative linguistic methods similar to bag-of-words (e.g. the Damerau–Levenshtein distance in the paper by Serva and Petroni [3]) have been used to map language evolution in the field of glottochronology. We attempt to apply the method in biological taxonomy, specifically to the Brassicaceae (Cruciferae) family. The subjects of our interest are well-known cultivated crops that appear morphologically very different and are culturally perceived as serving different purposes (e.g. oil from oilseed rape, turnip as animal feed and cabbage as a side dish). Despite their phenotypic divergence, these crops are very closely related, which is not obvious from their morphology. For this reason, we consider Brassicaceae crops appropriate illustrative examples for introducing the method. For the analysis, we use genetic markers (internal transcribed spacer [ITS] and maturase K [matK]). Until now, the bag-of-words model has not been used for biological taxonomy; therefore, the results of the bag-of-words analysis are compared with the existing, very well-developed Brassica taxonomy. Our goal is to present a method that is suitable for reconstructing language development and may also be usable for biological taxonomy.
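To make the idea concrete, the following is a minimal sketch of how a bag-of-words representation could be built from nucleotide sequences and used to compare taxa. It is not the procedure used in the study: treating overlapping k-mers as "words", the choice of k = 3, the cosine distance, the function names `kmer_bag` and `cosine_distance`, and the toy sequences are all assumptions made purely for illustration. The sequences are invented and are not real ITS or matK data.

```python
from collections import Counter
from math import sqrt


def kmer_bag(sequence, k=3):
    """Count overlapping k-mers ("words") in a nucleotide sequence.

    Word order is discarded; only the counts are kept, which is the
    defining property of a bag-of-words representation.
    """
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(bag_a, bag_b):
    """Return 1 minus the cosine similarity of two word-count bags."""
    shared = set(bag_a) & set(bag_b)
    dot = sum(bag_a[w] * bag_b[w] for w in shared)
    norm_a = sqrt(sum(c * c for c in bag_a.values()))
    norm_b = sqrt(sum(c * c for c in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty bag is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical toy sequences (invented, not real marker data).
toy_sequences = {
    "taxon_a": "ACGTACGTTGCA",
    "taxon_b": "ACGTACGATGCA",
    "taxon_c": "TTTTGGGGCCCC",
}

bags = {name: kmer_bag(seq) for name, seq in toy_sequences.items()}

# Pairwise distances; smaller values mean more similar word content.
names = sorted(bags)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = cosine_distance(bags[first], bags[second])
        print(f"{first} vs {second}: {d:.3f}")
```

A pairwise distance matrix of this kind could then be passed to a clustering or tree-building method and the resulting grouping compared with an established taxonomy, which is the kind of comparison the abstract describes.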

eISSN: 2544-6339
Language: English

Publication timeframe: 2 times per year
Journal Subjects: Cultural Studies, Cultural Theory, Semiotics, Linguistics and Semiotics, Applied Linguistics, Quantitative, Computational, and Corpus Linguistics, Biosemiotics, Theoretical Frameworks and Disciplines, General Linguistics

Published Online: Aug 02, 2019

Page range: 122 - 132

Received: Nov 19, 2018

Accepted: Jan 22, 2019

DOI: https://doi.org/10.2478/lf-2018-0011

Keywords
bag-of-words, glottochronology, language development, evolution, taxonomy, molecular genetics, Brassicaceae

© 2018 Hana Owsianková et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
