The Academy of Finland decided to fund our project proposal on “Green NLP – controlling the carbon footprint in sustainable language technology” from the call on sustainable and energy-efficient ICT solutions. We are looking forward to three years of exciting research and work together with our colleagues from TurkuNLP and CSC.
GreenNLP addresses the growing energy consumption of modern natural language processing (NLP). Neural language models and machine translation systems require heavy computation to train, and their sizes are constantly growing, which makes them expensive to deploy and run. In our project we will reduce training costs and model sizes through optimizations of the underlying machine learning algorithms, using techniques such as knowledge transfer and model compression. Furthermore, we will focus on multilingual solutions that can serve many languages with a single model, reducing the number of actively running systems. Finally, we will openly document and freely distribute all our results to enable efficient reuse of ready-made components, further decreasing the carbon footprint of modern language technology.
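As a rough illustration of the knowledge transfer idea mentioned above: in knowledge distillation, a small student model is trained to match the softened output distribution of a large teacher model. The sketch below is a minimal, self-contained example and not part of the project's code; the function names and the temperature value are our own illustrative choices.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature yields a softer
    # (more uniform) distribution, exposing the teacher's "dark knowledge".
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between the softened teacher and student distributions,
    # scaled by T^2 so gradients keep a comparable magnitude across temperatures.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

In practice this distillation term is combined with the ordinary cross-entropy loss on gold labels; the loss is zero when the student exactly reproduces the teacher's distribution and grows as the two diverge.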
Raul Vazquez from the FoTran team gave a talk at the AI Day 2022 on “A Closer Look at Parameter Contributions When Training Neural Language and Translation Models”. The paper is published at COLING 2022. This is joint work with Hande Celikkanat, Vinit Ravishankar, Mathias Creutz and Jörg Tiedemann, and it examines the training dynamics of neural language and translation models using fine-grained loss-change allocation analyses.
- Place: Kielikeskus (Fabianinkatu 26), Juhlasali
- Date: Friday November 25, 2022
- Time: 15:15 – 17:45
Update 28 November: Thanks for attending! You can find the presentation slides here (UH account required).
The purpose of this event is to bring students together with industry representatives who work with language technology in one way or another. The event is open to anyone interested in learning about career opportunities. We will have short presentations of relevant companies and their business, with time left for questions and discussion. There will also be an opportunity to speak informally with the industry representatives face to face.
We have invited various language service providers and LT businesses; the preliminary list of confirmed participants is below:
- Kielikone (Elina Söderblom)
- Lingsoft (Sebastian Andersson)
- Semantix (Teemu Tenhunen)
- Utopia Analytics (Saara Palma-Suominen, Sami Virpioja)
- Sanoma Media Finland (Clemens Westrup)
- Front.AI (Tiila Käenniemi)
- Huawei (Adrian Flanagan)
Please sign up here by Sunday 20 November if you intend to participate. (The registration is not binding; it is just to facilitate the organization.)
The Helsinki-NLP paper “When to Laugh and How Hard? A Multimodal Approach to Detecting Humor and Its Intensity” has been selected as one of the outstanding papers at the main conference of COLING 2022. Congratulations to Khalid Alnajjar and Mika Hämäläinen!
Helsinki-NLP received the 2022 Steven Krauwer Award for CLARIN Achievements for the work on open machine translation for Ukrainian. Thank you very much for this award, and especially thanks to everyone who contributed data, software and help with putting this all together! Let us continue to help people in need, recognizing the importance of open and transparent language technology and the responsibilities we have in society. Thank you!
Timothée Mickus from the FoTran team gave a talk in October in the Machine Learning Coffee Seminars on “Linear Structures in Transformer Embedding Spaces”. The talk is available online.
Jörg Tiedemann gave a talk at the Learning Machines Seminar 2022 in Sweden on “Translations as semantic mirrors – Representation learning with multilingual data”. The talk is available online.
Our new project on High-Performance Language Technologies (HPLT) has started and we will scale data sets, language models and neural MT to a new level. In relation to that, the language technology group in Helsinki has also been selected for one of the first Finnish extreme scale projects on the supercomputer LUMI.
Our project there, called LumiNMT, aims to train neural machine translation models at large scale using state-of-the-art transformer models and novel modular multilingual setups. It will focus on increasing language coverage and making efficient use of massively parallel data sets. Our research group wants to use LUMI’s extensive parallel computing capabilities to reduce training time and scale up model size.
The presentations from our workshop on Uncertainty and Grounding in Language and Translation Models (UnGroundNLP) are now online and can be accessed from our workshop website: https://blogs.helsinki.fi/fcai-sig-lsc/ungroundnlp-2022/
Thanks to all speakers and participants for a great day with plenty of interesting discussions!