The presentations from our workshop on Uncertainty and Grounding in Language and Translation Models (UnGroundNLP) are now online and can be accessed from our workshop website: https://blogs.helsinki.fi/fcai-sig-lsc/ungroundnlp-2022/
Thanks to all speakers and participants for a great day with plenty of interesting discussions!
RWS has published a blog post about the importance of OPUS and the development of transparent machine translation with good language coverage.
In response to the ongoing crisis in Ukraine, we have started to collect language tools and resources that support the Ukrainian language. At Helsinki-NLP, we have focused especially on the development of open translation models and tools, and we are currently working on improved models for more language pairs. Hopefully some of them can help communication and interaction with people in need.
An article about the importance of language technology has been published in “Språkbruk” from the Institute for the Languages of Finland. The article argues that we need to be careful with our language data and that we need to make an effort to develop transparent and open technology to handle the valuable information we produce.
Finnish dialects create a lot of trouble when interacting with computers, since it is impossible to speak a language without speaking in a dialect of some sort. Mika Hämäläinen, Niko Partanen, Khalid Alnajjar and Jack Rueter from our language technology team have created software that can automatically detect, normalize and generate Finnish dialects. Their research made it to the news on our university website.
The Language Bank of Finland and the Swedish Literary Society in Finland are collecting Finland-Swedish speech data (https://doneraprat.fi) (see YLE article with video: Vill du att röststyrning ska fungera på finlandssvenska? Kom med och donera prat [Do you want voice control to work in Finland-Swedish? Come and donate speech]). Note that the campaign for collecting spoken Finnish also continues (https://lahjoitapuhetta.fi/).
If you wish to know how the database may affect everyday life and how it can be used in research, listen to the YLE podcast “Second Last Word” (YLE pod: Så här lär sig din dammsugare finlandssvensk dialekt [How your vacuum cleaner learns Finland-Swedish dialect] and YLE article: Pratande kylskåp och smarta glasögon hjälper dig handla mat – det här är den röststyrda framtiden [Talking refrigerators and smart glasses help you buy food – this is the voice-controlled future]).
Yuri Balashov, professor of philosophy at the Institute for Artificial Intelligence at the University of Georgia, published a review article about OPUS-CAT in the ATA Chronicle:
OPUS-CAT: A State-of-the-Art Neural Machine Translation Engine on Your Local Computer
We are looking for a post-doctoral researcher for a two-year appointment starting on September 1, 2021 at the latest as part of the ERC project FoTran (Found in Translation).
Deadline: May 31, 2021
The FoTran project (http://www.helsinki.fi/fotran) focuses on cross-lingual sentence representation learning and multilingual neural machine translation. The goal is to learn highly abstract, language-agnostic semantic representations from massively parallel data sets. We emphasize both extrinsic evaluation (using various downstream tasks) and intrinsic assessment of the models in order to understand and interpret neural models and distributed representations. The position opened in this call is meant to further strengthen our work on the interpretability and explainability of multilingual neural language and translation models, and we are seeking strong candidates with a background in computational linguistics or related fields.
More information about the position and application procedures can be found here:
We welcome applications until May 31, 2021 with the following required attachments:
- Motivation letter including ideas for project-related research work (1-4 pages)
- A curriculum vitae
- List of publications
- The degree certificate of the PhD degree (or documentation of public defence permission)
- Names of two referees who would be available to provide references on demand
Recommendation letter(s) from previous supervisors or employers can optionally also be attached to the application.
Further information: Jörg Tiedemann (jorg.tiedemann AT helsinki DOT fi)
Our ELG project OPUS-MT is mentioned in the IEEE Spectrum post on machine translation within the European Language Grid:
Another early-stage project is coming from Jörg Tiedemann at the University of Helsinki, who is working with colleagues to develop open translation models for the Grid. These models use deep neural networks—layered software architectures that implement complex mathematical functions—to map text into numeric representations. Using data sets to train the models to find the best ways to solve problems takes a lot of computing power and is expensive. Making the models available for re-use will help developers build tools for low-density languages. “Minority languages get too little attention because they are not commercially interesting,” Tiedemann says. “This gap needs to be closed.”
Are you an ambitious researcher looking for an interesting postdoc, research fellow or PhD position?
The Finnish Center for Artificial Intelligence (FCAI) offers 22 new researchers the opportunity to join a unique research community with an attractive joint mission. Especially interesting for NLP researchers: Topic 11: Interactive AI using multimodal communication
FCAI welcomes applicants with diverse backgrounds, and qualified female candidates are explicitly encouraged to apply. The deadline for applications is October 5, 2020 (midnight UTC+02:00). Read more and apply here: https://fcai.fi/we-are-hiring