The Language and Identity project has reached one of its goals: the original spelling of the early English letters in the Corpus of Early English Correspondence has been normalized.
Normalizing, or standardizing, idiosyncratic spelling improves the accuracy of automatic language analyses. We tested the impact of normalization on 17th-century correspondence data and reported the results at the ICAME conference in Oslo in June 2011 and at the Helsinki Corpus Festival in Helsinki in September 2011:
http://www.helsinki.fi/varieng/CoRD/corpora/CEEC/standardized.html