Webinar, 5 July 2019
Tanja Säily and Eetu Mäkelä are hosting a live session on “The OED and historical text collections: discovering new words” in the Oxford English Dictionary webinar series.
Data wrangling seminar, 25 April 2019
The three invited DIGIHUM projects share a desire to use messy, nonstandard data as an imperfect proxy or lens through which to study complex human phenomena. They also share a desire to combine structured and unstructured data, faceting phenomena identified in noisy textual data (riddled with OCR errors or internet-speak) against categorical metadata on social, temporal and geographical aspects.
Thus, while targeting vastly different materials and questions, the projects have had to deal with very similar methodological issues. This seminar aimed to highlight these commonalities and to learn from how each project has dealt with them. The seminar was therefore structured around each project discussing in concrete terms the workflows it has employed to wrangle information out of its data.
Three thematic viewpoints were discussed: 1) concerns over data quality and representativeness, 2) workflows for unifying and cleaning up data for processing and 3) turning data into interpretations.
Visit to Oxford University Press, 26 February 2019
Tanja Säily, Eetu Mäkelä, Terttu Nevalainen and Samuli Kaislaniemi had a meeting with staff members of the Oxford English Dictionary, discussing our current and future research using the OED. We also gave a talk on NATAS in the OED lecture series.
HSYRF, 2–3 March 2018
Terttu Nevalainen gave the Bad Data keynote lecture and Tanja Säily taught a workshop on “Data visualization in historical sociolinguistics” at the Historical Sociolinguistics Young Researchers Forum in Brussels, 2–3 March 2018.
STRATAS project course, autumn 2016
This course gave students the opportunity to become better acquainted with the literature on historical sociolinguistics and to gain hands-on experience of applying this theoretical knowledge by working as part of an academic research project. In addition to taking part in the everyday activities of the project, students participated in the creation of a database recording the reliability of printed collections of 18th-century English letters, transcribed 19th-century Finnish letters from manuscript, tested linguistic software tools developed in the project, and helped design the project website.