Tommi Nieminen joins Helsinki-NLP

My name is Tommi Nieminen, and I recently joined the Helsinki-NLP research group as a new PhD student. For the past two decades, I have worked in the translation industry, starting as a translator and gradually drifting to more technical roles, such as CAT tool support, localization engineering, translation process automation, and machine translation development. Due to my work history, my research focuses mostly on the use of language technology in professional translation.

I have a long history with the University of Helsinki. I enrolled on an MA Philosophy course at the university in 2001, and after a long period of academic absence and part-time study (and a change of disciplines) I finally graduated with an MA in language technology in 2018. Since then I have participated in two academic projects involving the university, Fiskmö and OPUS-MT: Open Translation Models, Tools and Services. In the course of these projects I developed the OPUS-CAT tool, which enables translators to use machine translation models from the OPUS-MT project in their normal working environments. The motivation behind OPUS-CAT is to make open-source machine translation technology and resources directly available to individual translators, so that they may have more control over how machine translation is integrated into their profession.

I am thrilled to be part of the team in Helsinki and the GreenNLP project, and to work on issues that have recently become more significant than ever before. I live far from Helsinki, so I am usually at the university only one day a week. I look forward to meeting those of you I have not met yet.

Introducing Elaine Zosa

Hello there! I’m Elaine, a new postdoctoral researcher in the HelsinkiNLP Research Group. To start off, I have not always worked in NLP. I worked in the financial technology sector before I decided to study for a master’s degree. I obtained my MSc in Computer Science at the University of Helsinki where my concentration was on algorithmic bioinformatics. After that, I was a research assistant in computational genomics at the Technical University of Munich. Then in late 2018, I started my doctoral research at the University of Helsinki, in the Discovery Research Group led by Prof. Hannu Toivonen.

During my PhD, I worked on two EU Horizon 2020 projects: NewsEye (https://www.newseye.eu/) and EMBEDDIA (http://embeddia.eu/). Both projects involved building tools to help analyse large-scale news collections. In the former, we focused on historical news collections from Finland, France, and Austria, and in the latter, on news media from less-represented European languages such as Finnish, Estonian, and Croatian. I worked on various tasks in the projects and helped develop new methods in topic modeling, lexical semantic change, news headline generation, and multilingual news matching. Methodological innovations aside, these projects exposed me to the inherently interdisciplinary nature of NLP and language technology, and that, I think, is the most exciting thing about this field. I enjoy building tools that could be useful to researchers in the humanities and social sciences, and beyond.

Now I am investigating methods to quantify and model uncertainty in various linguistic tasks. You can find out more about my work on my homepage: https://ezosa.github.io/

New team member: Shaoxiong Ji

Hello,

My name is Shaoxiong Ji. This year, I started as a Postdoctoral Researcher at the University of Helsinki, working on high-performance language technology at the Language Technology research group led by Prof. Jörg Tiedemann.

My research focuses on multilingual NLP and machine translation. In the HPLT project, I will be working on NLP resource development, such as large-scale data, and on training large models and knowledge distillation models. I am also interested in other topics such as modular neural networks, zero-shot cross-lingual tasks, and other low-resource problems.

I did my Ph.D. at Aalto University under the supervision of Assoc. Prof. Pekka Marttinen in the Machine Learning for Healthcare research group. My Ph.D. thesis is about natural language processing for healthcare applications. During my doctoral candidature, I was a visiting researcher with Prof. Hinrich Schütze at the University of Munich (LMU Munich, Germany) and Dr. Mikko Peltola at the Finnish Institute for Health and Welfare (THL, Finland). Prior to my doctoral candidature, I did MPhil research with Prof. Xue Li and Prof. Helen Zi Huang at the University of Queensland (UQ Australia), working on NLP applications for social good. I was also a visiting researcher with Assoc. Prof. Erik Cambria at Nanyang Technological University (NTU Singapore), where I worked on sentiment analysis, especially emotion recognition in conversations.

I also spent half a year as a research assistant and a visiting scholar with Dr. Guodong Long and Dr. Shirui Pan at the University of Technology Sydney (UTS Australia), working on federated learning and mobile internet applications.

I am happy to join the group and to discuss interesting topics with you. Thank you for taking the time to read my introduction. I look forward to meeting you!

Best,
Shaoxiong

New team member: Ona de Gibert Bonet


Hello everyone!

My name is Ona de Gibert Bonet and I am thrilled to introduce myself as a new PhD student in the Department of Digital Humanities at the University of Helsinki. I joined the Helsinki-NLP group in January this year.

To tell you a bit about my background, I was born and raised in Barcelona. I received a B.A. in Modern Languages and Literature from the University of Barcelona (2016) and earned an M.S. in Language Analysis and Processing from the University of the Basque Country (2018).

I am passionate about Machine Translation (MT), which is what led me to pursue a PhD in Language Technology. I am excited to join the research team led by Jörg Tiedemann, where I will be exploring MT for low-resource languages with a focus on knowledge distillation. I believe that this research area has great potential to help democratize MT by broadening its accessibility, and I am eager to contribute to this work.

Apart from my academic interests, I love dancing ballet and lindy hop. In my free time, you can often find me doing very Finnish things: knitting or in the sauna. Both activities help me relax and clear my mind after a long day of research.

Thank you for taking the time to read my introduction. I cannot wait to see what this academic journey has in store for us all. If you see me around campus, please don’t hesitate to say hello!

Best,
Ona

New project accepted: Green NLP

The Academy of Finland decided to fund our project proposal on “Green NLP – controlling the carbon footprint in sustainable language technology” from the call on sustainable and energy-efficient ICT solutions. We are looking forward to three years of exciting research and work together with our colleagues from TurkuNLP and CSC.

GreenNLP addresses the problem of increasing energy consumption caused by modern solutions in natural language processing (NLP). Neural language models and machine translation systems require heavy computation to train, and their size is constantly growing, which makes them expensive to deploy and run. In our project we will reduce training costs and model sizes by optimizing the underlying machine learning algorithms with techniques that make use of knowledge transfer and compression. Furthermore, we will focus on multilingual solutions that can serve many languages in a single model, reducing the number of actively running systems. Finally, we will openly document and freely distribute all our results to enable efficient reuse of ready-made components and further decrease the carbon footprint of modern language technology.

Meet the LT industry 2022

  • Place: Kielikeskus (Fabianinkatu 26), Juhlasali
  • Date: Friday November 25, 2022
  • Time: 15:15 – 17:45

Update 28 November: Thanks for attending! You can find the presentation slides here (UH account required).

The purpose of this event is to arrange a meeting between students and representatives of the industry that work with language technology in one way or another. The event is open to anyone who is interested in getting information about career opportunities. We will have short presentations of relevant companies and their business and leave time for questions and discussions. There will also be the opportunity to informally speak to the industry representatives face to face.

We have invited various language service providers and LT businesses. The preliminary list of confirmed participants is given below:

  • Kielikone (Elina Söderblom)
  • Lingsoft (Sebastian Andersson)
  • Semantix (Teemu Tenhunen)
  • Utopia Analytics (Saara Palma-Suominen, Sami Virpioja)
  • Sanoma Media Finland (Clemens Westrup)
  • Front.AI (Tiila Käenniemi)
  • Huawei (Adrian Flanagan)

Please sign up here by Sunday 20 November if you intend to participate. (The registration is not binding; it is just to facilitate the organization.)

2022 Steven Krauwer Award for OPUS-MT for Ukrainian

Helsinki-NLP received the 2022 Steven Krauwer award for CLARIN achievements for the work on open machine translation for Ukrainian. Thank you very much for this award, and special thanks to everyone who contributed data, software and help with putting this all together! Let us continue to help people in need, recognizing the importance of open and transparent language technology and the responsibilities we have in society. Thank you!