Difference between Written and Spoken Czech: The Case of Verbal Nouns Denoting an Action

Open access

Abstract

The present paper extends our understanding of how actions are expressed by verbal nouns in written versus spoken Czech. It compares two corpora: the Czech part of the Prague Czech-English Dependency Treebank (written) and the Prague Dependency Treebank of Spoken Czech (spoken).

We show that while the written corpus includes more complex noun phrases with more explicit expression of adnominal participants, noun phrases in the spoken corpus contain more deletions and more exophoric references. We also carried out a quantitative analysis focusing on relative frequencies of combinations of participants modifying verbal nouns; although the written corpus shows higher relative frequencies, the order of the relative frequencies of particular combinations is the same in both types of communication.


The Prague Bulletin of Mathematical Linguistics

The Journal of Charles University
