When: 23-24 November 2023
Where: University of Helsinki, Minerva Plaza, Siltavuorenpenger 5 A, Helsinki, Finland
The detection of similar passages of text in a large corpus, often called text reuse, is gaining popularity as a methodological step in Digital Humanities research. Intertextual relationships have been an object of interest for various humanities disciplines for a long time, but their discovery with traditional methods is usually very labour-intensive. Automatic text reuse detection provides results that are easy to interpret and can readily lead to new insights, especially for less studied corpora.
A lot of research has been done on text reuse detection methods, but the understanding of the results and their application in answering research questions often seem very context-specific. Furthermore, the most typical application of text reuse detection is as a discovery method for finding examples on which a qualitative argument can be built. There seems to be a need for a more developed theory of the phenomenon of text similarity and of the insights that can be gained from a large-scale quantitative analysis of it.
The purpose of this workshop is to bring together researchers from various DH projects that have employed text reuse, to answer the leading question: Is text reuse more than a discovery method? Can we progress towards a more formal theory of what kinds of patterns can be found in text reuse data and how to interpret them?
Proposed topics for discussion include, but are not limited to:
- Measures to quantify text similarity and their distribution
- Kinds and degrees of similarity (verbatim copy, paraphrase, translation, oral transmission, etc.) and their quantitative characteristics
- Quantitative methods to analyse the spreading of texts in time and space
- Measuring the influence and reception of a text
- Network analysis of graphs of text similarity
- Bridging quantitative and qualitative analysis
The workshop is organised by Maciej Janicki (University of Helsinki, FILTER project) and Mikko Tolonen (University of Helsinki, head of the Computational History group, HPC-HD project) in collaboration with Dariah-FI and the project Informationsflöden över Östersjön funded by SLS.