Open access

Andrew Hardie

Abstract

This paper argues for, and presents, a modest approach to XML encoding for use by the majority of contemporary linguists who need to engage in corpus construction. While extensive standards for corpus encoding exist (most notably the Text Encoding Initiative’s Guidelines and the Corpus Encoding Standard based on them), these are rather heavyweight approaches, implicitly intended for major corpus-building projects, which are rather different from the increasingly common efforts in corpus construction undertaken by individual researchers in support of their personal research goals. Therefore, there is a clear benefit to be had from a set of recommendations (not a standard) that outlines general best practices in the use of XML in corpora without going into any of the more technical aspects of XML or the full weight of TEI encoding. This paper presents such a set of suggestions, dubbed Modest XML for Corpora, and posits that such a set of pointers to a limited level of XML knowledge could work as part of the normal, general training of corpus linguists.

The Modest XML recommendations cover the following topics, which, according to the foregoing argument, constitute sufficient knowledge of XML for most corpus linguists’ day-to-day needs: use of tags; adding attribute-value pairs; recommended use of attributes; nesting of tags; encoding of special characters; XML well-formedness; a collection of de facto standard tags and attributes; going beyond the basic de facto standard tags; and text headers.
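
Several of the topics listed above (tags, attribute-value pairs, nesting, special-character encoding, and well-formedness) can be illustrated with a small sketch. The element and attribute names used here (`text`, `u`, `id`, `who`) and the helper functions are hypothetical choices made for illustration only; they are not taken from the Modest XML recommendations themselves. The sketch uses Python's standard `xml.etree.ElementTree` parser to check whether a corpus fragment is well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical corpus fragment. The tag and attribute names are
# illustrative assumptions, not the paper's recommended tag set.
# Note the nested tags, the attribute-value pairs, and the entity
# references &amp; and &lt; standing in for the special characters & and <.
SAMPLE = """<text id="demo01">
  <u who="A">Fish &amp; chips, please.</u>
  <u who="B">Is 3 &lt; 5?</u>
</text>"""

# Not well-formed: a bare ampersand and an unclosed <u> element.
BROKEN = '<text id="demo02"><u who="A">Fish & chips</text>'


def is_well_formed(xml_string):
    """Return True if the string parses as well-formed XML."""
    try:
        ET.fromstring(xml_string)
        return True
    except ET.ParseError:
        return False


def utterances(xml_string):
    """Return (speaker, text) pairs for every <u> element."""
    root = ET.fromstring(xml_string)
    return [(u.get("who"), u.text) for u in root.iter("u")]


if __name__ == "__main__":
    print(is_well_formed(SAMPLE))
    print(is_well_formed(BROKEN))
    print(utterances(SAMPLE))
```

A check of this kind only establishes well-formedness; it says nothing about whether a file follows any particular tag set or text-header convention.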

Open access

Ute Römer, Audrey Roberson, Matthew B. O’Donnell and Nick C. Ellis

Abstract

This paper combines data from learner corpora and psycholinguistic experiments in an attempt to find out what advanced learners of English (first language backgrounds German and Spanish) know about a range of common verb-argument constructions (VACs), such as the ‘V about n’ construction (e.g. she thinks about chocolate a lot). Learners’ dominant verb-VAC associations are examined based on evidence retrieved from the German and Spanish subcomponents of ICLE and LINDSEI and collected in lexical production tasks in which participants complete VAC frames (e.g. ‘he ___ about the...’) with verbs that may fill the blank (e.g. talked, thought, wondered). The paper compares findings from the different data sets and highlights the value of linking corpus and experimental evidence in studying linguistic phenomena.

Open access

Irma Taavitsainen, Turo Hiltunen, Anu Lehto, Ville Marttila, Päivi Pahta, Maura Ratia, Carla Suhr and Jukka Tyrkkö