Letters looted by pirates offer invaluable knowledge on everyday language

Asylum seekers from Syria have prompted questions about integration recently, but historical sociolinguistics shows migration is not a new phenomenon and integration is possible.

Linguistic and ethnohistoric data indicate migration has played a large role in much of Europe’s history. Dr. Marijke van der Wal from Leiden University has been studying the effects of migration through language. Looking at these migrants’ language, we can get a glimpse of how they adjusted to their new homes.

It was an honour to have Dr. Marijke van der Wal open the HiSoN Conference 2016 in Helsinki.

Letters as Loot: where pirates save the day

“This time machine-like collection of letters has value to discover more about migration patterns across Europe” – Dr. Marijke van der Wal.

During the Anglo-Dutch Wars, English and Dutch pirates, sailing with government support, seized ships of enemy countries. To prove that a seizure was legal, the correspondence carried on board was submitted as evidence. These letters are held to this day in the National Archives in England.

Since most surviving historical documents were written by upper-class men or clergy, these “letters as loot” provide invaluable knowledge of the everyday language of not just the upper but also the middle and lower classes of society. The letters, often correspondence between family members, also offer a peek into the lives of women and children.

Family letters from refugee merchants

Dr. van der Wal explains how authors were identified with the help of the Amsterdam marriage register

With so much movement of people from different countries, bilingualism, and even multilingualism, was a normal state of affairs, as it still is today.

Van der Wal presented the case of the Heusch merchant family, who emigrated from the Netherlands to Hamburg, Germany, in the 17th century to escape war and religious persecution. There, they became an important part of the German trade scene. Such successful integration did not mean complete assimilation: they still wrote to each other in their native Dutch, with few traces of German in their written language. Even second-generation members of the family, born and raised in Germany, wrote in Dutch.

Turning back to the present day situation of migrant integration, it is then possible to envision a society where migrants become contributing members of society without the need to eliminate their native language and culture.

Population boomed in Amsterdam

Amsterdam’s population grew from 30,000 to 200,000 between 1585 and 1660 purely from an influx of migrant workers. Many of these migrants came to the Netherlands looking for better employment opportunities, and many locals were concerned that Amsterdam’s excellent system of poor relief would attract still more of them. Does this sound familiar?

These migrants, however, were not leeches on the system but became fully integrated members of society. They often started families in the Netherlands. Van der Wal suggests that marriage was one route to integration, because “marrying a local woman was a fruitful strategy for integration and economic success.” She was surprised to find that some immigrants integrated into Dutch society to the extent that they wrote letters in Dutch even to their non-Dutch relatives.

The HiSoN (Historical Sociolinguistics Network) conference was held at the University of Helsinki, Finland, on 10–11 March 2016. This year, particular emphasis was placed on the social aspect of historical linguistics. Over 50 sociolinguists from 17 countries participated, with over 20 languages represented in the papers. Dr. Marijke van der Wal opened the event with her plenary talk.

Text: Tina Lin, Anna Suutarla, Iida Hinkkanen. Photos: Anna Suutarla.

‘More Trump Than Is Healthy’ – How Political Speeches Have Changed

Dr. Jukka Tyrkkö has been researching the sociolinguistic aspects of political discourse. Until recently everything seemed to be going smoothly, but then along came Trump and skewed all the data.

Tyrkkö’s closing plenary of the HiSoN conference dealt with two corpora. One of them is a small corpus of speeches delivered between 1800 and 2015. Taking a macro approach to language data, this research focuses on the trends and tendencies of features such as sentence length, word length and readability score. The other one is a political Twitter corpus gathered from tweets of the 2016 US presidential primaries.
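To make these measures concrete, here is a minimal Python sketch that computes average sentence length, average word length and one readability score for a piece of text. The post does not say which readability formula Tyrkkö used, so the Automated Readability Index (ARI) stands in here as an assumption. The function name text_metrics and the sample sentence are likewise made up for illustration, and the sentence splitting is deliberately naive.

```python
import re


def text_metrics(text: str) -> dict:
    """Return mean sentence length, mean word length and ARI for a text.

    ARI is an assumed stand-in; the source does not name the score used.
    """
    # Naive sentence split: ., ! or ? followed by whitespace or end of text.
    # Abbreviations such as "vs." will be split incorrectly.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    # Word tokens are runs of letters, digits or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    # Count only letters and digits, as ARI does.
    chars = sum(ch.isalnum() for w in words for ch in w)
    words_per_sentence = len(words) / len(sentences)
    chars_per_word = chars / len(words)

    # Automated Readability Index (higher means harder to read).
    ari = 4.71 * chars_per_word + 0.5 * words_per_sentence - 21.43
    return {
        "words_per_sentence": words_per_sentence,
        "chars_per_word": chars_per_word,
        "ari": ari,
    }


if __name__ == "__main__":
    # Invented sample text, not taken from either corpus.
    print(text_metrics("I am going to win. You know it."))
```

Run over speeches grouped by year, figures like these would give the kind of trend lines the macro approach looks at; a real study would need a proper tokenizer and sentence splitter.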

From the political speech to the political tweet

At the beginning of his talk, Tyrkkö warned the audience that they would get “more Trump than is healthy”.

The nature of political speeches has changed dramatically over time. Technological advancements, like the invention of the radio and television, and more recently the internet and social media, play a major role in these changes.

Whereas political speeches delivered at the beginning of the 20th century consisted of long and complex sentences, in today’s social-media-oriented world political messages have to be short and easy to grasp.

Dr. Tyrkkö explains that the never-ending election cycle pushes politicians to focus their message on winning the popular vote and leaves little room for “boring” fact-based politics that requires an in-depth understanding of the issues.

Add to this a politically passive audience that already experiences information overload, and you get ideology-based argumentation that reduces complex political issues to an us-vs.-them set-up and thrives on hyperbole, the art of exaggeration. Everything is the best or at the very least just plain great. Can you already see how Trump’s slogan “Make America Great Again” fits in with all of this?

Jukka Tyrkkö claims that political messages can now be incomplete ramblings as long as there is something that grabs the listener’s attention.

A different kind of candidate

Donald Trump is a businessman and currently a candidate for the presidential nomination of the Republican Party in the United States.

Over the last 10 years there has been a shift from rational to ideological leadership, and as can be seen from Trump’s election speeches, he successfully exploits this fact. His campaign speeches are often inconsistent ramblings that go off topic and exaggerate, appealing to the emotions of his audience rather than presenting factual arguments.

Word cloud compiled by Jukka Tyrkkö based on Trump’s most recent tweets. The size of the font in the word cloud increases with the frequency of the word in the tweets.

Coming back to how Trump has skewed the data on political speeches, it is useful to consider how his speeches differ from those of his fellow politicians. Whereas the average word length for present-day English-speaking politicians is five characters, the figure for Trump is 3.85. Another characteristic of Trump’s discourse is that he loves to talk about himself and to address his audience directly; his top five words are I, going, you, they and, of course, Trump.
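For readers curious how a top-five list or a word cloud comes about, the following Python sketch counts word frequencies and maps each count to a font size. It is only an illustration: the post does not describe Tyrkkö’s tooling, the function names and point sizes are invented here, the linear scaling is one possible choice among several, and the sample “tweets” are made up.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Count lower-cased word tokens across a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def font_sizes(counts, top_n=5, min_pt=10.0, max_pt=48.0):
    """Map the top_n most frequent words to font sizes in points.

    Sizes grow linearly with frequency between min_pt and max_pt
    (an assumed scaling; word cloud tools differ in how they scale).
    """
    top = counts.most_common(top_n)
    if not top:
        return {}
    highest, lowest = top[0][1], top[-1][1]
    span = highest - lowest
    sizes = {}
    for word, n in top:
        # If all counts are equal, give every word the largest size.
        share = 1.0 if span == 0 else (n - lowest) / span
        sizes[word] = min_pt + share * (max_pt - min_pt)
    return sizes


if __name__ == "__main__":
    # Invented sample texts, not real tweets.
    sample = ["great great great", "great deal", "deal done"]
    counts = word_frequencies(sample)
    print(counts.most_common(3))
    print(font_sizes(counts, top_n=3))
```

The most frequent word gets the largest font and the least frequent of the selected words the smallest, which is all the caption above says about the word cloud.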

New world, modern tactics

Judging from their tweets, one might think that today’s politicians are simply not intelligent and rational individuals.

Dr. Tyrkkö claims, however, that it is more likely that “politicians and their teams are deliberately dumbing down election speeches”. Modern politicians are no simpler or more stupid than their predecessors; they have merely adapted their tactics to the rules dictated by the new media.

After listening to the plenary, we fear that if we do not require fact-based argumentation and rational decision-making from our politicians, we may end up with a rabble-rouser for president.

Tyrkkö’s research article on the subject, including full references and detailed statistical data, will be published in the near future.

The HiSoN (Historical Sociolinguistics Network) conference was held at the University of Helsinki, Finland, on 10–11 March 2016. This year, particular emphasis was placed on the social aspect of historical linguistics. Over 50 sociolinguists from 17 countries participated, with over 20 languages represented in the papers.

Text: Iida Hinkkanen, Tina Lin, Anna Suutarla. Photos: Anna Suutarla.

The Criminalized Poor and the Rise of Mega-Corpora

The poor have been degraded and criminalized in texts of all genres throughout the ages through denigrating adjectives and descriptors. Professor Tony McEnery from Lancaster University provides us with a unique and chilling view of how linguistic changes reflect the changing treatment of the poor during the 17th century.

Tony McEnery. Photo: Tanja Säily.

Criminalizing the Poor in the 17th Century

On October 19–22, the Research Unit for Variation, Contacts and Change in English (VARIENG) organized the D2E (From Data to Evidence) conference, gathering scholars of English language studies to discuss how big data, rich data, and uncharted data can affect, enhance, or hinder linguistic research.

The first plenary of the conference was given by Professor Tony McEnery, who spoke on the use of corpora to study the socioeconomic treatment of the poor through a linguistic lens. His talk offered a fascinating view of how large corpora can support a sociolinguistic approach both to the use of derogatory terminology and to the changes that happen decade by decade.

McEnery’s team explored over a billion words of writing from the 17th century through the Early English Books Online (EEBO) corpus, which includes nearly every piece of literature printed in the UK, Ireland and British North America from the 15th to the 18th century. The team identified the most common words used to refer to the poor and examined their use in the texts to uncover patterns of meaning that reveal how the poor were treated, linguistically and socio-economically, during the 17th century.

McEnery provides an enticing case by examining the evolution of terms such as rogue, beggar, vagrant, and vagabond in the 17th century, as well as the language associated with the words. The study paints an interesting image of how literary and religious texts treated the terms and which modifiers were used with them.

Beggar, for example, was typically modified by adjectives expressing understanding of their poverty, such as poor, needy and miserable. There were few negative connotations at the beginning of the 17th century, but this changed sharply during the second decade, when the word sturdy came to be commonly attached to beggar, portraying beggars as able-bodied people who chose to beg rather than work. Likewise, drunkenness started to be associated with them in the 1620s. Similarly, vagabond was associated with negative modifiers or nearby words such as vile, loose, and whore, and rogue with cheating, lying, and villain.

McEnery’s results demonstrate the prejudices our societies have held against the poor across the ages. They provide an unsettlingly clear map of the frequency, dispersion, and connectivity of the negative vocabulary used to construct the negative semiotics of poverty, tracing the changes decade by decade through the 17th century.
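The idea of tracking modifiers decade by decade can be sketched in a few lines of Python. This is not McEnery’s method, which the post does not describe in detail; it is a toy illustration that simply counts the word directly before a target noun and groups the counts by decade. The function name, the handling of plurals and the four sample sentences are all invented for the example.

```python
import re
from collections import Counter, defaultdict


def modifiers_by_decade(records, target):
    """Count the word directly before `target`, grouped by decade.

    records: iterable of (year, text) pairs.
    Matches the singular and a plain "-s" plural of the target.
    """
    forms = {target, target + "s"}
    table = defaultdict(Counter)
    for year, text in records:
        decade = year // 10 * 10
        tokens = re.findall(r"[a-z]+", text.lower())
        # Walk over adjacent token pairs: (previous word, current word).
        for prev, word in zip(tokens, tokens[1:]):
            if word in forms:
                table[decade][prev] += 1
    return table


if __name__ == "__main__":
    # Invented sentences for illustration only, not EEBO data.
    sample = [
        (1605, "a poor beggar at the gate"),
        (1608, "the needy beggar and the poor beggar"),
        (1624, "a sturdy beggar"),
        (1627, "sturdy beggars and a poor beggar"),
    ]
    for decade, counts in sorted(modifiers_by_decade(sample, "beggar").items()):
        print(decade, counts.most_common())
```

Counts like these can only point to passages worth reading; what a modifier meant in its context still has to be established by looking at the texts themselves.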

Mega-corpora methodologies

In recent years, the evolution of corpora has given scholars access to more text than ever before. The rise of mega-corpora has been both a curse and a blessing: their vast size makes it possible to work routinely with texts on scales never seen before, but their inclusiveness brings additional challenges of contextualization. When the boundaries of a corpus are not clearly mapped, or when a mega-corpus spans several well-contextualized corpora, extra effort is needed to keep the data representative.

According to McEnery, the methods of the study deserve even more scrutiny than the case results themselves: with his unique perspective in the field, McEnery presents singularly convincing and well-thought-out insights on working with mega-corpora. Even though the results alone would have merited many more plenaries, the core philosophy of McEnery’s methodology is both sobering and enthralling: “A corpus is more than a load of text. It needs linguistic tools to strip the essential parts of data.” Linguistic context is everything.

McEnery emphasizes that mega-corpora provide unique opportunities for synergy between the study of history and linguistics. While historians can help linguists by pointing out cultural contexts and frames, linguists can provide them with much needed linguistic context: whether a concept or change in semantics is relevant, and what it could mean within the linguistic frame.

While the vast size of modern corpora may tempt some researchers to rely on statistics and correlations alone, McEnery disagreed strongly with such methods. The most important factors in language are change and context, as he emphasizes: “Dynamism is the key…Close reading is the key. No mathematics will tell you what is happening there.” Meaning is never stable, and there is only so much that word frequency and connectivity can tell you without a deep reading of the selected text segments.

Methods and results go hand in hand

McEnery’s main warning to academics concerns failed contextualization: not so much a failure to contextualize the research texts and results as a failure to set those texts against the 20th century methodology used to study them. As our conceptual maps, and those of our societies, change, we must closely examine our own conceptual frames when doing research, because our concepts of meaning may be fundamentally wrong when we read historical texts.

McEnery manages a rare feat: he addresses both methodological and sociolinguistic issues and raises important points about each, showing how we can and should use mega-corpora and, topically, how the ways in which the poor were criminalized through language in the 17th century echo chillingly in our own time.

Text: Mika Loponen

Don’t save English, save the dying languages

England may have lost a mediocre cricket player in Peter Trudgill, but the world of linguistics gained a living legend.

Students of the Language Change Database Project course interviewed the noted sociolinguist and his wife, Jean Hannah, over coffee a day before his guest lecture at the Metsätalo building in Helsinki on March 18th. Their questions revolved around Trudgill’s experiences as a student at Cambridge and Edinburgh, his career choices, and the future of linguistics and academia.

In his youth Trudgill did indeed dream of playing cricket for England but ultimately his career in academia progressed quite organically. In hindsight, he said he could not “imagine anything better” and has enjoyed his time at several universities, including the University of Lausanne, Switzerland (home to “the best coffee served at a university”) and the University of Agder, Norway, where he is currently tenured.

Peter Trudgill and his wife Jean Hannah

The past is now

Unfortunately, a long career like Trudgill’s might be harder to achieve today than in the 1960s. In his view, one of the biggest challenges facing the modern academic world is zealous business thinking, which hurts the fields that are considered “unprofitable”, such as the Humanities.

Putting business first might prove shortsighted, as resources are needed now if the world wants to document languages that are under threat of extinction. Studying those languages would shed light on the ways languages have developed throughout human history, back into prehistoric times.

Trudgill giving his lecture (left) and Professor Terttu Nevalainen (right)

The origins and evolution of human languages still remain obscured by lack of data.

As linguists we can only hypothesise about how prehistoric languages may have functioned. The prevalent way to do that is through the Uniformitarian hypothesis, which holds that by observing languages in the present, we can understand what human languages must have been like in the past.

However, in his lecture titled “The Uniformitarian Hypothesis and Prehistoric Linguistics” Trudgill urged researchers to exercise caution when using the hypothesis to make generalisations. He stressed that linguists need to be mindful of chronological and geographical bias and not build models based solely on languages spoken in modern societies, which are highly atypical in the broader history of humankind. Living in “societies of strangers” where the vast majority of the people do not know each other is a very recent development in human history.

In the absence of time machines

Trudgill does not mean to say that observing modern languages is fruitless or that we should abandon the Uniformitarian hypothesis.

While we and our prehistoric ancestors share largely the same physiology and language faculties, we have no way of gathering any hard data on the prehistoric languages themselves (short of inventing a time machine). Thus we must hypothesise based on the workings of modern languages.

Effectively, Trudgill argues that more attention should be paid to the small, endangered and “remote” languages that are still spoken in a context that strongly resembles what prehistoric societies were like: small and tightly-knit groups of people who all know each other, or “societies of intimates”.

Indeed, research on such languages has yielded surprising nuggets of insight into what our prehistoric ancestors’ languages might have been like. Small communities give rise to language features that seem atypical and exotic if looked at from an Indo-European context, yet such features will most likely have been significantly more commonplace in prehistoric times.

Examples of such features abound in small languages.

One example given by Trudgill is Onya Darat, a language spoken on the island of Borneo, whose system of personal pronouns shows generational affiliation. In other words, its personal pronouns signal whether the person addressed belongs to the same generation as the speaker, a younger one, or an older one.

Such a feature can only appear in a society where people know each other and are aware of everyone’s ancestry. Thus we can surmise that such complex structures linked to non-anonymity may have existed in prehistoric languages, even if they are exceedingly rare in modern ones.

Audience of Trudgill's lecture on March 18th

Trudgill likewise asserted during the coffee meetup that more linguists should be engaged in documenting small languages around the world, as they are verging on extinction in this era of globalisation and interconnectedness. Now is our last chance to record many of them for posterity.

Lamenting that he himself had not done more fieldwork in his career, Trudgill half-jokingly encouraged younger linguists to “forget about English” in favour of focusing on conserving these endangered languages. Hearing a Professor of English make such claims might sound strange, but it only illustrates the urgency of such conservationist efforts.


Text: Sofia Bergman and Toni Matikainen

Pictures: Saana Kallioinen and Ina Liukkonen

Interview questions, comments and proofreading: Sanna van Erk-Koivisto, Ida Mauko, Antti Siitonen and Ari Slioor

18 March Peter Trudgill: Sociolinguistic Typology and the Uniformitarian Hypothesis

Professor Peter Trudgill will visit the Department of Modern Languages and give a guest lecture entitled “Sociolinguistic Typology and the Uniformitarian Hypothesis”.

Date: Wednesday, 18th March, 14-16
Venue: Metsätalo auditorium 1 (ground floor)

Prof. Trudgill’s visit is hosted by the Academy project Reassessing Language Change at VARIENG.

Everyone is welcome to attend.


One of the fundamental bases of modern historical linguistics is the uniformitarian principle. This principle states that knowledge of processes that operated in the past can be inferred by observing ongoing processes in the present. In this paper I present a sociolinguistic-typological perspective on this issue, where by “sociolinguistic typology” I mean a form of linguistic typology which is sociolinguistically informed and which investigates the extent to which it is possible to produce sociolinguistic explanations for why a particular language variety is like it is.

This work is based on the assumption that there is a possibility that certain aspects of social structure may be capable of having an influence on certain aspects of language structure. I argue that, insofar as the characteristics of individual human languages are due to the nature of the human language faculty, there cannot be any questioning of the uniformitarian principle. We have to assume that the nature of the human language faculty is the same the world over, and that it has been like that ever since humans became fully human. But what if some of the characteristics of individual human languages are due to social factors?