The difficult relationship between Brexit and Twitter data

During the Digital Humanities Hackathon our group is researching data collected from Twitter. Our data consists of all the tweets, retweets and mentions connected to the hashtag #Brexit from the period 26 February to 15 April 2019.

The timeline of our data is interesting, since 29 March was the original date on which the United Kingdom was supposed to leave the European Union, but as we all know, that did not happen. This means we have the possibility to look in detail at something that failed to happen and at how people reacted to it on this particular social media platform.

One important thing to remember when doing research with data like this is that you need to know your data: how it was collected and what its limitations are. Twitter is only one social media platform among others, and not everyone uses it.

Our data covers only certain demographics of the general population, and within those only the Twitter users who take part in political discussion on the platform. And who knows whether they share their true feelings on social media as openly as they would in “real life”?

The problem with using a single hashtag as a parameter for collecting data is that hashtags are always self-selected markers, and not all Twitter users add them to their tweets. In our case there could also be other hashtags connected to Brexit as a phenomenon that are used on their own, without the addition of #Brexit. This sets certain limitations on interpreting our data.
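To make the limitation concrete, here is a minimal sketch of what filtering by a single hashtag does. The tweets below are made up for illustration and are not taken from our dataset, and the function names are our own invention, not part of any Twitter tool.

```python
import re

# Hypothetical example tweets; NOT taken from the hackathon dataset.
tweets = [
    "Parliament votes again tonight #Brexit",
    "No deal is still on the table #NoDeal #PeoplesVote",
    "What happens on 29 March? #brexit #Article50",
    "Leaving the EU was never going to be simple",
]

HASHTAG = re.compile(r"#(\w+)")

def hashtags(text):
    """Return the lower-cased hashtags found in a tweet."""
    return {tag.lower() for tag in HASHTAG.findall(text)}

def collected_by(tag, texts):
    """Keep only tweets that carry the given hashtag (case-insensitive)."""
    tag = tag.lower().lstrip("#")
    return [t for t in texts if tag in hashtags(t)]

collected = collected_by("#Brexit", tweets)
missed = [t for t in tweets if t not in collected]

print(len(collected), "collected,", len(missed), "missed")
for t in missed:
    print("missed:", t)
```

In this toy example two of the four tweets are clearly about Brexit but would never end up in a #Brexit collection: one uses only related hashtags and the other uses no hashtag at all.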

Digital humanities methods have great advantages for doing fast, large-scale research, but there are still things the computer cannot achieve. For example, yesterday Heng and Faiz were identifying the gender of 577 MPs manually because there wasn’t a good program that could detect gender accurately from the Twitter username.

Besides, the bigger the data is, the messier it can be. It is easy to obtain a large amount of data, but without close manual checking the data can also have low reliability. We, for example, have to be wary of bot activity relating to political discussion. Bigger and faster is not always better, and we need to find a balance between quantity and quality when doing digital humanities.

With data coming from social media there are also ethical questions to be thought about and solved. Because Twitter data contains so much personal information about users, sharing the original datasets alongside your research would be highly unethical. What kind of information can we use, and what is too personal? This means that anonymisation of usernames, tweets and other critical information is extremely important, and a lot of time needs to be dedicated to these practices.
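As an illustration of what the first step of such a practice could look like, here is a minimal sketch that replaces usernames and @mentions with stable pseudonyms using a keyed hash. This is one possible approach under our own assumptions, not a description of the hackathon's actual pipeline: the key, the record and the function names are all hypothetical.

```python
import hashlib
import hmac
import re

# Assumption: a secret key that is kept out of the shared dataset and repository.
SECRET_KEY = b"replace-with-a-private-random-key"

MENTION = re.compile(r"@(\w+)")

def pseudonym(username):
    """Map a username to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(SECRET_KEY, username.lower().encode("utf-8"), hashlib.sha256)
    return "user_" + digest.hexdigest()[:12]

def scrub_mentions(text):
    """Replace every @mention in a tweet with the pseudonym of that user."""
    return MENTION.sub(lambda m: "@" + pseudonym(m.group(1)), text)

# Hypothetical record; NOT taken from the hackathon dataset.
record = {"user": "SomeAccount", "text": "Agree with @AnotherUser on #Brexit"}
safe = {"user": pseudonym(record["user"]), "text": scrub_mentions(record["text"])}

print(safe)
```

The same account always gets the same pseudonym, so retweet and mention networks can still be studied. Note that this is pseudonymisation rather than full anonymisation: the wording of a tweet can often be searched for and traced back to its author, so the tweet text itself still needs care.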

Written by Minna Turunen, MA student of the Cultural Heritage program at the University of Helsinki, and Heng Gong, MA student of English studies at the University of Helsinki.

Brainstorming in an interdisciplinary environment a.k.a. how to write on a wall

First we were given the magic nametags which provide us with endless coffee at the cafeteria. Digital Humanities Hackathon 2019 has started. Then we were put in a small room full of strangers. Those strangers would not be strangers for long, for we would study Brexit on transnational social media for the next couple of weeks.

It turns out we come from all over the world: China, India, the US, Finland, Russia. We come from different educational backgrounds, and most of us know nothing about studying social media. But we fear not – we have the support of our group leaders Daria, Joe and Steven, the data team and the general organizers.

But what is digital humanities? It is a field of study that combines the humanities and social sciences with computational methods to solve issues that interest humanities and social sciences researchers. The possibilities are endless. Not only can humanists now study larger datasets than before, but computer scientists can also tackle questions from the humanities. We can study the digital world or use new methods to study history. On the other hand, we all need to learn to work together.

Haven’t you always wanted to cover a wall with mind maps? Well, we got to do that!

What is interesting about the Brexit phenomenon on Twitter during the spring of 2019? The due date of the UK’s separation from the EU came and went and was postponed. Each of us stood up, one at a time, and wrote on the wall what would be interesting to study about this phenomenon. Some of us were interested in applying specific methods to the data, some of us were interested in abstract contexts related to Brexit.

The brainstorming session was full of forgotten words and of new concepts and methods being explained. It was also full of new connections between subjects that none of us could have seen on our own. The session also caused a lot of anxiety, as the amount of new information was overwhelming. Taking a small break seemed to help us focus, as did voicing concerns and sharing the feelings we had.

Dealing with social media data raises a lot of ethical questions. Where do we draw the line on what information has been willingly shared to be seen and used? How do we anonymize the information we are using?

Deciding on one approach proved hard. We all had so many ideas about what could be interesting in the data that making tough decisions on what to do seemed impossible. But now that we all feel comfortable enough to voice our opinions, we are making our way towards actual research.

Next up – the research plan. Follow our journey in this blog and on Twitter with the hashtags #dhh19 & #brexit.

Written by Sonja Sipponen, MA student of general history at the University of Helsinki.