Search Results

1-10 of 279 items:

  • Linguistics and Semiotics
Processing of Derivational Features for (Semi)Automatic Creation of Dictionary Definitions in the User Interface (CZEDD) for Learning Czech as a Second Language: Suffix -tel and -ista

Abstract

This work-in-progress paper presents CZEDD, a tool that helps users learn to predict the meaning of words. CZEDD consists of (semi)automatic definitions for derived words, since many of these words have a predictable lexical meaning. The tool is intended for foreigners learning Czech, and it could serve as a dictionary and/or translator that stores definitions based on a word’s structure. Two detailed case studies (the suffixes -tel and -ista) illustrate the approach.

Open access
The Dialekt Corpus and Its Possibilities

Abstract

DIALEKT, a corpus of Czech dialects, has been continuously curated and expanded by the Spoken Corpora section of the Institute of the Czech National Corpus. This paper first gives a concise characterization of the corpus, addressing its sociolinguistic parameters and the subcorpora that can be derived from them, its two-layer approach to the transcription of dialect recordings, and its lemmatization and morphological tagging. We then present examples of how linguists can use the corpus and discuss two related projects that expand on the currently available possibilities: an archive of dialect-specific differential phones of the Czech language (completed) and an interactive web environment for map-based visualization of data from all kinds of spoken corpora (in preparation). Thanks in part to these additional tools, the DIALEKT corpus should serve both experts in the field and the general public.

Open access
An analysis of certainly and generally in Late-Modern English English history texts

Abstract

This paper analyses the adverbs certainly and generally as stancetaking markers. These adverbial devices are said to show authorial stance and to communicate the author’s commitment to or detachment from the information presented, and so they are classified as epistemic adverbs (Alonso-Almeida 2015). For this study, I have selected a corpus of history texts from the Modern English period (1700-1900), as compiled in The Corpus of History English Texts (Crespo and Moskowich 2015). The two evidential adverbs are examined using computer corpus tools, with manual inspection also employed to assess the meaning of the items in context. The findings suggest that, in this type of scientific text, the two adverbs serve different pragmatic functions: certainly functions mostly as a booster, whereas generally seems primarily to serve a hedging purpose (Hyland 2005a).

Open access
Identification of Spontaneous Spoken Texts in Slovak

Open access
Prosodically-conditioned Syllable Structure in English

Open access
Introducing Semantic Labels into the DeriNet Network

Open access
Modifications of the Czech Morphological Dictionary for Consistent Corpus Annotation

Open access
Relevant Criteria for Selection of Spoken Data: Theory Meets Practice

Open access
Gender-Specific Adjectives in Czech Newspapers and Magazines

Open access
Which phonetic features should pronunciation instructions focus on? An evaluation on the accentedness of segmental/syllable errors in L2 speech

Open access