The Dialekt Corpus and Its Possibilities

Hana Goláňová; Martina Waclawičová

Open Access

Published Online: Dec 21, 2019

Journal of Linguistics/Jazykovedný časopis

Volume 70 (2019): Issue 2 (December 2019)

Page range: 336 - 344

DOI: https://doi.org/10.2478/jazcas-2019-0063

Keywords
spoken corpus, dialect corpus, dialectology, corpus design, transcription

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

DIALEKT, a corpus of Czech dialects, has been continuously curated and expanded by the Spoken Corpora section of the Institute of the Czech National Corpus. This paper first gives a concise characterization of the corpus, addressing its sociolinguistic parameters and the subcorpora that can be derived from them, its two-layer approach to the transcription of dialect recordings, and its lemmatization and morphological tagging. We then present examples of how linguists can use the corpus and discuss two related projects that expand on the currently available possibilities: an archive of dialect-specific differential phones of the Czech language (completed) and an interactive web environment for spatial, map-based visualization of data from all kinds of spoken corpora (in preparation). Thanks in part to these additional tools, the DIALEKT corpus should serve both experts in the field and the general public.

eISSN: 1338-4287
ISSN: 0021-5597
Language: English

Publication timeframe: 2 times per year
Journal Subjects: Linguistics and Semiotics, Theoretical Frameworks and Disciplines, Linguistics, other

© 2019 Hana Goláňová et al., published by Sciendo
