Annotating the ICE corpora pragmatically – preliminary issues & steps

Martin Weisser
Guangdong University of Foreign Studies, China


Since the inception of the ICE project in 1990, ICE corpora have been used extensively to investigate and compare varieties of English on different linguistic levels. These levels, however, have so far been restricted primarily to lexis and lexico-grammar, while relatively little has been achieved to date in investigating the pragmatic strategies used by the speakers in these corpora. One of the main reasons for this shortcoming is a lack of suitable annotation that would make such a detailed pragmatic comparison possible. This paper proposes a model and format for converting the ICE corpora and enriching them with different levels of pragmatics-relevant information, and discusses the issues involved in this endeavour. Finally, to illustrate the feasibility of this aim, the paper includes a small case study carried out on a number of files, pointing out how the resulting annotations could later be exploited in pragmatics research.


