corpus | Nykykielten laitos

The poor have been degraded and criminalized in texts of all genres throughout the ages through denigrating adjectives and descriptives. Professor Tony McEnery from Lancaster University provides us with a unique and chilling view on how linguistic changes portray the changing treatment of the poor during the 17^th century.

Tony McEnery. Photo: Tanja Säily.

Criminalizing the Poor in the 17^th Century

In October 19-22, the Research Unit for Variation, Contacts and Change in English (VARIENG) organized the D2E – From data to evidence conference, gathering scholars from the fields of English language studies to discuss how big data, rich data, and uncharted data can affect, enhance, or hinder linguistic research.

The first plenary of the conference was given by Professor Tony McEnery, speaking on the use of corpora in socioeconomic studies of the treatment of the poor through a linguistic lens. Professor McEnery’s speech provides a fascinated view on how large corpora can be used to provide a sociolinguistic approach both on the use of derogative terminology and the changes that happen decade by decade.

McEnery’s team explored over a billion words of writing from the 17^th century through the Early English Books Online (EEBO) corpus, which includes nearly every piece of literature printed in the UK, Ireland and British North America from the 15^th to the 18^th century. The team identified the most common words used to identify the poor, examining their use in the texts to uncover patterns of meaning denoting the linguistic socio-economical treatment of the poor during the 17^th century.

McEnery provides an enticing case by examining the evolution of terms such as rogue, beggar, vagrant, and vagabond in the 17^th century, as well as the language associated with the words. The study paints an interesting image of how literary and religious texts treated the terms and which modifiers were used with them.

Beggar, for example, was typically modified by adjectives denoting the understanding of their poverty, such as poor, needy and miserable. There are few negative denotations in the beginning of the 17^th century – until this changed sharply during the second decade, when the word sturdy became to be commonly attached to beggar to portray them as able-bodied people who choose to not work but beg. Likewise, drunkenness starts to be attached to them in the 1620s. Similarly, vagabonds were associated with negative modifiers or close-proximity words such as vile, loose, and whore, and rogues with close association to cheating, lying, and villain.

McEnery’s results demonstrate the prejudices felt against the poor in our societies across the ages, providing us with an unsettlingly clear map on the frequency, dispersion, and connectivity of negative vocabulary used to create the negative semiotics of poverty, mapping the changes decade by decade through the 17^th century.

Mega-corpora methodologies

In recent years, the evolution of corpora have provided scholars with unprecedented access to texts in vaster amounts of text masses than ever before. The rise of mega-corpora has been both a curse and a blessing – their vast sizes have meant the ability to routinely utilize texts on scales never seen before, but their inclusiveness has brought additional challenges with contextualization. When the boundaries of a corpus are not clearly mapped, or when a mega-corpus spans across the boundaries of various well contextualized corpora, extra effort is needed in maintaining the representability of the data.

According to McEnery, the methods of the study deserve even more scrutiny than the case results themselves: with his unique perspective in the field, McEnery presents singularly convincing and well-thought insights on working with mega-corpora. Even though the results would have been worth many more plenaries, the essential core philosophy of McEnery’s methodology is both sobering and enthralling. “A corpus is more than a load of text. It needs linguistic tools to strip the essential parts of data.” Linguistic context is everything.

McEnery emphasizes that mega-corpora provide unique opportunities for synergy between the study of history and linguistics. While historians can help linguists by pointing out cultural contexts and frames, linguists can provide them with much needed linguistic context: whether a concept or change in semantics is relevant, and what it could mean within the linguistic frame.

While the vast sizes of modern corpora may sway some researchers to rely on statistics and correlations, McEnery disagreed heavily with such methods. The most important factors in language are change and context, as he emphasizes: “Dynamism is the key…Close reading is the key. No mathematics will tell you what is happening there.” Meaning is never stable, and there is only so much word frequency and connectivity will tell you without a deep reading of the selected text segments.

Methods and results go hand in hand

McEnery’s main warning to academics is in failed contextualization: not necessarily contextualizing the research texts and results, but in contextualizing the research texts with the 20^th century methodology used to study them: as our – and our societies’ – conceptual maps change, we must closely examine our own conceptual frames when doing research, as our concepts of meaning may be fundamentally wrong in reading historical texts.

McEnery manages a rare feat: touching both methodological and socio-linguistic issues and bringing up important aspects in both. How we can and should use mega-corpora – and topically, how the ways the poor were criminalized through linguistic means in the 17^th century echoes chillingly in our own time.

Text: Mika Loponen

Read more:

Roman Leibov, Docent of Russian literature in the University of Tarto, will give three guest lectures at the Department of Modern Languages. The topic of the lectures is digital humanities – resources, methods, and tendencies.

When: 25 September 4-6 pm, 26 September at 2-4 pm and 4-6 pm

Where: lecture hall 22, Metsätalo, Unioninkatu 40

Please note that the language of the lectures is Russian.

***

A brief interview with Roman Leibov:

What are your lectures at the University of Helsinki about?

Digital Humanities as broad field of technique/knowledge, its parts and sub-fields (digital libraries, archives, data curation, distant reading etc.).

What makes this field especially interesting for you? What are the current challenges and opportunities in the field of digital humanities?

It’s obvious now than traditional humanities are transforming by a new cultural situation. The translation of traditional forms of philological knowledge into these new frames is a duty of the older generation. Moreover, I’m really interested in “distant reading”, in quantitative methods in the history of literature, and in the field that may be called “corpus poetics”.

Nykykielten laitos

Ajankohtaista asiaa kielistä ja kulttuureista

Tag Archives: corpus

The Criminalized Poor and the Rise of Mega-Corpora

Criminalizing the Poor in the 17^th Century

Mega-corpora methodologies

Methods and results go hand in hand

Guest lectures by Roman Leibov on 25-26 September

Criminalizing the Poor in the 17th Century

Mega-corpora methodologies

Methods and results go hand in hand

Criminalizing the Poor in the 17^th Century