Issues and challenges in compiling a corpus of Early Modern English plays for comparison with those of William Shakespeare

Abstract

In this article I discuss the issues and challenges of compiling a corpus of historical plays by a range of playwrights that is highly suitable for use in comparative, corpus-based research into language style in Shakespeare’s plays. In discussing sources for digitised historical play-texts and criteria for making a selection for the present study, I argue that not just any set of Early Modern English plays constitutes a suitable basis upon which to make reliable claims about language style in Shakespeare’s plays relative to those of his peers. I point out factors outside authorial choice which potentially have a bearing on language style, such as sub-genre features and change over time. I also highlight some particular difficulties in compiling a corpus of historical texts, notably dating and spelling variation, and I explain how these were addressed. The corpus detailed in this article extends the prospects for investigating Shakespeare’s language style by providing a context into which it can be set and, as I indicate, is a valuable new publicly accessible resource for future research.

OPEN ACCESS
