About this project

DigiSami was a research project by the University of Helsinki, funded by the Academy of Finland. Its goal was to investigate how modern techniques of corpus linguistics and language technology can be applied to support the revitalisation of less-resourced languages. The language selected as the focus of the project was North Sami, spoken in several northern areas of Finland and Norway.

We collected and annotated a North Sami spoken language corpus, the DigiSami Corpus. The corpus was collected in North Sami-speaking areas of both Finland and Norway. The annotations were created using modern corpus linguistics techniques. The speech corpus was made available to our partners at Aalto University, who worked on North Sami speech technology.

In our language technology work, we developed techniques for the localisation of spoken dialogue systems. We worked towards a proposed interactive robot dialogue system, SamiTalk, in which a robot will talk in North Sami about a wide range of topics using information from Sami Wikipedia. We believe that our prototype demonstrating SamiTalk was the world’s first Sami-speaking robot.

We organised an international workshop, IWSDS 2016, at Saariselkä in Finnish Lapland. This was the northernmost International Workshop on Spoken Dialogue Systems in the IWSDS series. More than fifty researchers came to the workshop from Japan, the USA and different parts of Europe. Based on the many high-quality papers presented at the workshop, we edited the book Dialogues with Social Robots – Enablements, Analyses, and Evaluation (Springer, 2017).

Our recent research focussed on multimodal analysis of spoken dialogues. Using machine learning techniques, we found correlations between dialogue topics, speakers’ body movements, laughter and speech, based on the audio and video recordings and annotations in the DigiSami Corpus. We also collaborated with Ville Hautamäki (University of Eastern Finland) on dialect recognition for North Sami.

***Best Paper Award***
The paper "Enabling Spoken Dialogue Systems for Low-resourced Languages: End-to-end Dialect Recognition for North Sami" by Trung Ngo Trong, Kristiina Jokinen and Ville Hautamäki won the Best Paper Award at the 9th International Workshop on Spoken Dialogue Systems (IWSDS 2018). Fulltext.

Trung receiving the award at IWSDS 2018 in Singapore.