Tommi Jouhiainen is defending his PhD thesis in language technology on “Language Identification in Texts” on Tuesday, May 28 at 12:00 in Auditorium XII in the main building of the university. The thesis is available at http://urn.fi/URN:ISBN:978-951-51-5131-5 and the opponent is Nikola Ljubešić from the Jožef Stefan Institute in Ljubljana and the University of Zagreb.
In both the news and multimodal translation tasks, the best systems from the Language Technology group utilised state-of-the-art neural machine translation models.
System papers describing the models will be presented at the Third Conference on Machine Translation (WMT18) at EMNLP 2018 later this year.
Congratulations to our team!
We are organising an event on representation learning from multilingual language data (FoTran2018), with great invited speakers:
- Kyunghyun Cho, NYU, New York
- Manaal Faruqui, Google
- André Martins, Unbabel, Lisbon
- Ivan Vulić, University of Cambridge
- Željko Agić, IT University of Copenhagen
Sign up if you want to participate or even present your work! Participation is free but registration is required. More info here: https://blogs.helsinki.fi/language-technology/fotran-2018/
The research project Natural Language Understanding with Cross-Lingual Grounding, funded by the Academy of Finland, was presented at the Academy of Finland opening seminar Novel Applications of Artificial Intelligence in Physical Sciences and Engineering Research (AIPSE) on 18 June by Dr. Alessandro Raganato and Dr. Hande Celikkanat.
Our presentation and the poster attracted a lot of interest from the seminar participants.
Our NLPL project was featured in the NordForsk newsletter with a link to this article!
The Helsinki Language Technology group had five papers at the recent Digital Humanities in the Nordic Countries (DHN 2018) conference, held on 7–9 March 2018 in Helsinki.
Distinguished Short Paper
- Jörg Tiedemann: Emerging Language Spaces Learned From Massively Multilingual Corpora [pdf]
- Emily Öhman, Kaisla Kajava: Sentimentator: Gamifying Fine-grained Sentiment Annotation [pdf]
- Yves Scherrer, Tanja Samardžić: ArchiMob: A multidialectal corpus of Swiss German oral history interviews [pdf]
- Seppo Nyrkkö: An approach to unsupervised ontology term tagging of dependency-parsed text using a Self-Organizing Map (SOM) [pdf]
- Mika Hämäläinen, Tanja Säily, Eetu Mäkelä: Normalizing Early English Letters for Neologism Retrieval [pdf]
Several projects will start during spring 2018. One of them is the ERC-funded project Found in Translation (FoTran). We are currently looking for motivated people with a background in computational linguistics or computer science to join our team. Please get in touch with us (via e-mail to jorg.tiedemann at helsinki.fi) if you are interested in doing your post-doctoral research or a PhD within the scope of the project!
We are also looking for a university lecturer in language technology. More information about this position is available from the university’s job opportunity page.
A new project funded by the ERC will start in our research group with the title “Found in Translation: Natural Language Understanding with Cross-Lingual Grounding”. It will focus on representation learning from multilingual data. More information will come soon! Here is an illustration to motivate the idea behind the project: