Tommi Jouhiainen is defending his PhD thesis in language technology, “Language Identification in Texts”, on Tuesday, May 28, at 12:00 in Auditorium XII in the main building of the university. The thesis is available at http://urn.fi/URN:ISBN:978-951-51-5131-5. The opponent is Nikola Ljubešić from the Jožef Stefan Institute in Ljubljana and the University of Zagreb.
- Place: Porthania, Suomen Laki -hall (PIV)
- Date: October 12, 2018
- Time: 14:15 – 15:45
The purpose of this event is to bring together students and representatives of companies that work with language technology in one way or another. The event is open to anyone who is interested in career opportunities in the field. There will be short presentations by the companies about their business, with time for questions and discussion. There will also be an opportunity to speak informally with the industry representatives face to face.
We have invited various language service providers and LT businesses. The preliminary list of confirmed participants is below.
- Delingua Language Services
- Insider Solutions
- Trademark Now
This list is subject to change, and more information about the program will be posted later.
We are organising an event on representation learning from multilingual language data (FoTran2018), with a great line-up of invited speakers:
- Kyunghyun Cho, NYU, New York
- Manaal Faruqui, Google
- André Martins, Unbabel, Lisbon
- Ivan Vulić, University of Cambridge
- Željko Agić, IT University of Copenhagen
Sign up if you want to participate or even present your work! Participation is free but registration is required. More info here: https://blogs.helsinki.fi/language-technology/fotran-2018/
Professor Jörg Tiedemann gave a talk at Charles University in the Czech Republic on 18 June as part of their Fred Jelinek Seminar Series.
Translated texts are semantic mirrors of the original text and the significant variations that we can observe across languages can be used to disambiguate the meaning of a given expression using the linguistic signal that is grounded in translation. We are interested in massively parallel corpora consisting of hundreds up to a thousand different languages and how they can be applied as implicit supervision to learn abstractions that could lead to significant improvements in natural language understanding. As a side-effect, we can also see how multilingual models can pick up essential relationships between languages building a continuous space with reasonable language clusters. I will talk about some initial results and plans for the future and I would like to get your feedback about those ideas.
Our NLPL project was featured in the NordForsk newsletter with a link to this article!
The Helsinki Language Technology group had five papers at the recent Digital Humanities in the Nordic Countries (DHN 2018) conference, held on 7–9 March 2018 in Helsinki.
Distinguished Short Paper
- Jörg Tiedemann: Emerging Language Spaces Learned From Massively Multilingual Corpora [pdf]
- Emily Öhman, Kaisla Kajava: Sentimentator: Gamifying Fine-grained Sentiment Annotation [pdf]
- Yves Scherrer, Tanja Samardžić: ArchiMob: A multidialectal corpus of Swiss German oral history interviews [pdf]
- Seppo Nyrkkö: An approach to unsupervised ontology term tagging of dependency-parsed text using a Self-Organizing Map (SOM) [pdf]
- Mika Hämäläinen, Tanja Säily, Eetu Mäkelä: Normalizing Early English Letters for Neologism Retrieval [pdf]
Several projects will start during spring 2018. One of them is the ERC-funded project Found in Translation (FoTran). We are currently looking for motivated people with a background in computational linguistics or computer science to join our team. Please get in touch with us (via e-mail to jorg.tiedemann at helsinki.fi) if you are interested in doing your post-doctoral research or a PhD within the scope of the project!
We are also looking for a university lecturer in language technology. More information about this position is available from the university’s job opportunity page.