Visit by Dana Roemling

Portrait photo of Dana. A person presenting as a woman with bold black glasses, long blonde hair and facial piercings. They wear a black jumper and smile at the camera.

A couple of months ago, I had the amazing opportunity to come to the University of Helsinki as a visiting researcher. This was part of my PhD journey, with the aim of familiarising myself further with machine learning. As forensic linguists, we often focus our work on less automated methods of analysis, so I wanted to explore how I can make use of (semi-)automated processes in the forensic context.

Since my PhD explores regional linguistic variation and its usefulness in authorship profiling, I wanted to work with someone whose language technology research focuses on regional varieties. That is why I got in touch with Yves Scherrer and the Corpus-Based Computational Dialectology (CorCoDial) research group. Having read many of the publications coming out of VarDial, I found that learning from and working with people active in this community was an excellent fit for me.

In total, I spent three months in Helsinki, from the end of February to the end of May. This meant that I could join some of the modules being held during that time, which allowed me to learn even more than just through collaboration. Additionally, I was able to join the research seminars held by the Language Technology research group, which meant I had an easy contact point to meet other researchers, but also to introduce my work to the group. Generally, the work culture is excellent and the joint lunch breaks make it easy to get to know people.

The main focus of my visit was on a project within the CorCoDial group. We replicated a study which had come out only a couple of weeks before I arrived. In our experiments, we concentrated on extracting dialect features used by dialect classifiers. This was to evaluate whether dialect classifiers can be explained or interpreted, since, in forensic linguistics, one of the big problems with automated processes is that they are opaque and can’t be used in evidential contexts. We have submitted and presented our work at several conferences, for instance ICLaVE12, and were able to publish a preprint of our work (soon to be published in the NLPAICS proceedings).

Suomenlinna’s melting ice

Although I had some goals in mind for my time in Helsinki, this visit has exceeded them by far. I have learnt a great deal for my PhD, but I also acquired skills I will be able to use beyond the PhD. I have met outstanding researchers and was welcomed with genuine warmth and enthusiasm. I remain grateful that this visit was possible and I had this opportunity, but most of all I am thankful for the friends I have made and collaborations built for the future.

 

Frosty branches

A typical winter day

Suomenlinna after the snow had gone

Research visit by Michal Štefánik from Masaryk University in Brno

Michal Štefánik

Why come to Helsinki?

I was already familiar with the Helsinki group's work, as I had built a lot upon their outcomes in my previous work: I've used OPUS for evaluating the distributional robustness of translation models, and in a few of my previous papers, I've also used Helsinki-NLP models as my base models. Additionally, I had a chance to attend one of the public talks by Prof. Tiedemann (the group leader), which left a good impression on me.

How was the arrangement?

I wrote a message to Prof. Tiedemann and later talked to Timothee, one of the lab members, which allowed me to align with the group's topics of interest. Eventually, I picked the dates from February to the end of April: these fitted my plans well, and it was also the time when the days in Helsinki were growing longer but there was still enough snow for winter sports.

What was I working on?

There were several interesting directions, but I eventually decided to focus on modularity in language models, which is what I spent most of my time working on.

I was looking into the effects of the modularization of languages in multilingual machine translation models: how modular language models disentangle language-specific features in their representations and, on the practical side, how modularization affects downstream translation quality in over 100 languages.

One of the rougher morning rides at the end of April 🙂

How was the stay, work-wise?

Building on so much previous work, the group has great access to both data and compute, which was really helpful in scaling my experiments to so many languages. There is someone to turn to regarding anything resource-related, which meant I hardly ever got stuck along the way and could stay focused on my main goals.


Most importantly, though, I think the group has a very friendly culture of open communication about essentially anything. It is perfectly common to ask anyone for help, as well as to ask anyone what they did over the weekend or whether they want to join for a beer after work!


How was the stay outside work?
Helsinki, on the verge of spring (February-March), is a great place for anyone who likes winter sports and having nature close by. If the city were a cross-country ski resort, it would likely be the biggest one in Europe, with around 1,000 km of maintained trails 🙂 I was able to ski around 400 km of them in March. On the other hand, people who don't like snow and winter might be surprised by 30 cm of new snow at the end of April 🙂 but then perhaps everyone would enjoy Helsinki's mild summers, with daylight until at least 9:30 pm from April to October.


Helsinki is also a good starting point for weekend trips nearby: I recommend taking an overnight train to Lapland for the aurora and wild nature, or a 2-hour ferry ride to Tallinn with its not-too-terrible historical sights!

A trip to Tallinn is only a 2-hour ferry ride