Charting orthographical reliability in a corpus of English historical letters

Open access


Research into orthography in the history of English is not a simple venture. Accounts of the history of English spelling are based primarily on printed texts, which fail to capture the range of variation inherent in the language; many manuscript phenomena are simply not found in print. Manuscript-based corpora would be the ideal research data, but because compiling them is resource-intensive, linguists use editions that have been produced by non-linguists. Many editions claim to retain original spellings, but in practice the text is always normalized at the graph level, and possibly further. This does not preclude using such a corpus for orthographical research, but there has been no systematic way to determine the philological reliability of an edited text. In this paper we present a typological methodology we are developing for evaluating the orthographical quality of edition-based corpora, with the aim of making the best use of bad data in the context of editions and manuscript practices. As a case study, we apply this methodology to the Early Modern and Late Modern English sections of the Corpus of Early English Correspondence.


