This work-in-progress paper presents CZEDD, a tool that helps users learn to predict the meaning of words. CZEDD consists of (semi-)automatically generated definitions of derived words, many of which have a predictable lexical meaning. The tool is intended for foreigners learning Czech and could serve as a dictionary and/or translator that stores definitions based on a word’s structure. Two detailed case studies (the suffixes -tel and -ista) illustrate the approach.
DIALEKT, a corpus of Czech dialects, has been continuously curated and expanded by the Spoken Corpora section of the Institute of the Czech National Corpus. This paper first gives a concise characterization of the corpus, addressing its sociolinguistic parameters and the subcorpora that can be derived from them, its two-layer approach to the transcription of dialect recordings, and its lemmatization and morphological tagging. We then move on to examples of how linguists can use the corpus and discuss two related projects that extend the currently available possibilities: an archive of dialect-specific differential phones of the Czech language (completed) and an interactive web environment for map-based spatial visualization of data from all kinds of spoken corpora (in preparation). Thanks in part to these additional tools, the DIALEKT corpus should serve both experts in the field and the general public.
This paper analyses the adverbs certainly and generally as stancetaking markers. These adverbial devices are said to show authorial stance and to communicate the author’s commitment to or detachment from the information presented, and so they are classified as epistemic adverbs (Alonso-Almeida 2015). For this study, I have selected a corpus of history texts from the Modern English period (1700–1900), as compiled in The Corpus of History English Texts (Crespo and Moskowich 2015). The two evidential adverbs are examined in this corpus using computer corpus tools, with manual inspection also employed to assess the meaning of the items in context. The findings suggest that, in this type of scientific article, the two adverbs are used with differing pragmatic functions: certainly functions mostly as a booster, whereas generally seems primarily to serve a hedging purpose (Hyland 2005a).
Róbert Sabo, Peter Krammer, Ján Mojžiš and Marcel Kvassay
 Lai, S., Xu, L., Liu, K., and Zhao, J. (2015). Recurrent convolutional neural networks for text classification. In Twenty-ninth AAAI conference on artificial intelligence.
 100 názorov. Accessible at: http://100nazorov.sk/
 Politika. Nový čas. FPD Media. Accessible at: https://www.cas.sk/spravy/politika
 Garabík, R.: Morfologická dezambiguácia. Accessible at: https://morphodita.juls.savba.sk/
 Straková, J., Straka, M., and Hajič, J. (2014). Open-source tools for morphology, lemmatization, POS tagging and named entity
Paula Orzechowska, Janina Mołczanow and Michał Jankowski
Syntax: The Relation between Sound and Structure. Cambridge, MA: MIT Press.
Topintzi, Nina. 2010. Onsets: Suprasegmental and Prosodic Behaviour. Cambridge: Cambridge University Press.
Trnka, Bohumil. 1966. A Phonological Analysis of Present-day Standard English. Alabama: University of Alabama Press.
Zydorowicz, Paulina, Orzechowska, Paula, Jankowski, Michał, Dziubalska-Kołaczyk, Katarzyna, Wierzchoń, Piotr and Dawid Pietrala. 2016. Phonotactics and Morphonotactics of Polish and English: Description, Tools and Applications. Poznań: Wydawnictwo
, pages 2825–2830.
 Sedláček, R., and Smrž, P. (2001). A New Czech Morphological Analyser ajka. In International Conference on Text, Speech and Dialogue, TSD 2001, pages 100–107, Berlin, Springer.
 Straková et al. (2014). Open-source tools for morphology, lemmatization, POS tagging and named entity recognition. In Proceedings of ACL 2014: System Demonstrations, pages 13–18.
 Ševčíková, M., and Panevová, J. (2018). Derivation of Czech verbs and the category of aspect. Linguistica Copernicana, 2018(15), pages 79–93.
 Ševčíková, M
Jaroslava Hlaváčová, Marie Mikulová, Barbora Štěpánková and Jan Hajič
. (2016). Universal Dependencies v1: A Multilingual Treebank Collection. In Proceedings of the 10th International Conference on LREC 2016, pages 1659–1666. Paris.
 Petkevič, V., Hlaváčová, J., Osolsobě, K., Šimandl, J., and Svášek, M. (2019). Microsyntactic Parts of Speech in NovaMorf, a New Morphological Annotation of Czech. In Proceedings of SLOVKO 2019 (this volume).
 Straková, J., Straka, M., and Hajič, J. (2014). Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition. In Proceedings of 52nd Annual Meeting of
Marie Kopřivová, Zuzana Komrsková, Petra Poukarová and David Lukeš
Corpus. In Peters, P., Collins, P., and Smith, A. (eds.), New Frontiers of Corpus Research. Amsterdam, pages 105–112.
 Oostdijk, N. et al. (2002). Experiences from the Spoken Dutch Corpus Project. Proceedings of the LREC 2002, pages 340–347.
 Schmidt, T. (2014). The Research and Teaching Corpus of Spoken German – FOLK. In Proceedings of the Ninth International conference on Language Resources and Evaluation (LREC’14), Reykjavik, Iceland: European Language Resources Association (ELRA).
 Allwood, J. et al. (2003). Annotations and Tools for
-Coulthard, C. R., and Moon, R. (2010). ‘Curvy, hunky, kinky’: Using corpora as tools for critical analysis. Discourse & Society, 21(2), pages 99–133.
 Cvrček, V. et al. (2010). Mluvnice současné češtiny. Praha, Karolinum.
 Cvrček, V. (2017). Paradigmatické korpusové dotazy a moderní diachronie. In M. Stluka & M. Škrabal (eds.), Liʃka a czban – Sborník příspěvků k 70. narozeninám prof. Karla Kučery, pages 117–130. Praha, Czech Republic: Nakladatelství Lidové noviny.
 Čmejrková, S. (2003). Communicating gender in Czech. In M. Hellinger, and H
, Kunath, Stephen, Gao, Zhiyan, Luu, Vu and Thao vy Vo. 2017. Transcribing non-native speech: the development of a crowdsourcing tool to evaluate perceptions of accented speech. Presented at the 11th International Conference on Native and Non-native Accents of English, Łódź, Poland.
Wilson, Colin and Lisa Davidson. 2013. Bayesian analysis of non-native cluster production. In Kan, Seda, Moore-Cantwell, Claire, and Robert Staubs (eds.), Proceedings of the Northeast linguistics society 40. 265–276.