Tommi Jouhiainen is defending his PhD thesis in language technology, “Language Identification in Texts”, on Tuesday, May 28, at 12 in Auditorium XII in the main building of the university. The thesis is available at http://urn.fi/URN:ISBN:978-951-51-5131-5, and the opponent is Nikola Ljubešić from the Jožef Stefan Institute in Ljubljana and the University of Zagreb.
FCAI hosted a Research Insight Event at Aalto University. The purpose of the day was to bring together FCAI’s academics and partners to discuss shared problems and to workshop future research ideas for the center.
Found in Translation (FoTran) was presented by Jörg Tiedemann.
“Natural Language Processing is not only an important component of intelligent systems that interact with users but also develops into a core discipline of AI that aims at learning world knowledge from human communication. The goal of our research is to use translations to pick up essential semantics of natural language. Combining multilingual signals with multimodal grounding we aim at improved models for natural language understanding.”
- Place: Porthania, Suomen Laki -hall (PIV)
- Date: October 12, 2018
- Time: 14:15 – 15:45
The purpose of this event is to arrange a meeting between students and representatives of the industry that works with language technology in one way or another. The event is open to anyone who is interested in getting information about career opportunities. We will have short presentations of relevant companies and their business and leave time for questions and discussions. There will also be the opportunity to informally speak to the industry representatives face to face.
We have invited various language service providers and LT businesses. The preliminary list of confirmed participants is below.
- Delingua Language Services
- Insider Solutions
- Trademark Now
This list is subject to change, and more information about the program will be posted later.
In both the news and multimodal translation tasks, the best systems from the Language Technology group utilised state-of-the-art neural machine translation models.
System papers describing the models will be presented at the Third Conference on Machine Translation (WMT18), held at EMNLP 2018, later this year.
Congratulations to our team!
We are organising an event on representation learning from multilingual language data (FoTran2018). We have great invited speakers:
- Kyunghyun Cho, NYU, New York
- Manaal Faruqui, Google
- André Martins, Unbabel, Lisbon
- Ivan Vulić, University of Cambridge
- Željko Agić, IT University of Copenhagen
Sign up if you want to participate or even present your work! Participation is free but registration is required. More info here: https://blogs.helsinki.fi/language-technology/fotran-2018/
Professor Jörg Tiedemann gave a talk at Charles University in the Czech Republic on 18 June as part of its Fred Jelinek Seminar Series.
Translated texts are semantic mirrors of the original text, and the significant variations that we can observe across languages can be used to disambiguate the meaning of a given expression using the linguistic signal that is grounded in translation. We are interested in massively parallel corpora covering hundreds, up to a thousand, different languages and in how they can be applied as implicit supervision to learn abstractions that could lead to significant improvements in natural language understanding. As a side effect, we can also see how multilingual models pick up essential relationships between languages, building a continuous space with reasonable language clusters. I will talk about some initial results and plans for the future, and I would like to get your feedback about those ideas.
The research project Natural Language Understanding with Cross-Lingual Grounding, funded by the Academy of Finland, was presented at the Academy of Finland opening seminar Novel Applications of Artificial Intelligence in Physical Sciences and Engineering Research (AIPSE) on 18 June by Dr. Alessandro Raganato and Dr. Hande Celikkanat.
Our presentation and the poster attracted a lot of interest from the seminar participants.
Our NLPL project was featured in the NordForsk newsletter with a link to this article!
The Helsinki Language Technology group had five papers at the recent Digital Humanities in the Nordic Countries (DHN 2018) conference, held on 7–9 March 2018 in Helsinki.
Distinguished Short Paper
- Jörg Tiedemann: Emerging Language Spaces Learned From Massively Multilingual Corpora [pdf]
- Emily Öhman, Kaisla Kajava: Sentimentator: Gamifying Fine-grained Sentiment Annotation [pdf]
- Yves Scherrer, Tanja Samardžić: ArchiMob: A multidialectal corpus of Swiss German oral history interviews [pdf]
- Seppo Nyrkkö: An approach to unsupervised ontology term tagging of dependency-parsed text using a Self-Organizing Map (SOM) [pdf]
- Mika Hämäläinen, Tanja Säily, Eetu Mäkelä: Normalizing Early English Letters for Neologism Retrieval [pdf]