Spelling standardization improves research results

The Language and Identity project has reached one of its goals: the original spelling of the early English letters in the Corpus of Early English Correspondence has been normalized.

Normalizing, or standardizing, idiosyncratic spelling improves the accuracy of automatic language analyses, because variant spellings of the same word are no longer treated as different words. We tested the impact of normalization on 17th-century correspondence data and reported the results at the ICAME conference in Oslo in June 2011 and at the Helsinki Corpus Festival in September 2011:

http://www.helsinki.fi/varieng/CoRD/corpora/CEEC/standardized.html
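
As a rough illustration of why normalization helps automatic analysis, the sketch below counts word frequencies before and after mapping variant spellings to a standard form. The variant table, the sample sentence, and the function name `normalize` are all made up for this example; they are not taken from the project's data or tools, and real normalization handles far more than a lookup table.

```python
from collections import Counter

# Hypothetical variant-to-standard mapping, for illustration only.
# It is NOT the project's actual normalization resource.
VARIANTS = {
    "haue": "have",
    "receiued": "received",
    "yowr": "your",
    "loue": "love",
}


def normalize(tokens):
    """Lowercase each token and replace known variants with the standard form."""
    return [VARIANTS.get(t.lower(), t.lower()) for t in tokens]


# Invented sample sentence with mixed original and standard spellings.
sample = "I haue receiued yowr letter and I have read it with loue"
tokens = sample.split()

raw_counts = Counter(t.lower() for t in tokens)
normalized_counts = Counter(normalize(tokens))

# Before normalization "haue" and "have" are counted as two different words;
# afterwards they are counted together.
print(raw_counts["have"], normalized_counts["have"])
```

With the original spelling, a frequency count splits one word across several spellings; after normalization the counts are merged, which is the kind of effect that makes tagging and other automatic analyses more reliable.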
