Fourth International Workshop on Computational Linguistics for Uralic Languages. Organised by ACL SIGUR (and University of Helsinki). 8th–9th January, 2018, Helsinki, Finland


The final proceedings version will be available in the ACL SIGUR section of the ACL Anthology.


Day 1

9:00    Registration
9:30    Invited talk: How we built the Finnish dependency treebank and all the many things we did with it (Veronika Laippala)
10:30   Poster boasters
11:00   Coffee + posters
12:00   Tracking Typological Traits of Uralic Languages in Distributed Language Representations (Johannes Bjerva and Isabelle Augenstein)
12:30   New Baseline in Automatic Speech Recognition for Northern Sámi (Juho Leinonen, Peter Smit, Sami Virpioja and Mikko Kurimo)
13:00   Analysing Finnish with word lists: the DDI approach to morphology revisited (Atro Voutilainen and Maria Palolahti)
13:30   Towards an open-source universal-dependency treebank for Erzya (Jack Rueter and Francis Tyers)
14:00   Lunch (self-paid)
14:30
15:00   Utilization of Nganasan digital resources: a statistical approach to vowel harmony (László Fejes)
15:30   Parallel Forms in Estonian Finite State Morphology (Heiki-Jaan Kaalep)
16:00   Extracting inflectional class assignment in Pite Saami: Nouns, verbs and those pesky adjectives (Joshua Wilbur)
16:30   Initial Experiments in Data-Driven Morphological Analysis for Finnish (Miikka Silfverberg and Mans Hulden)
17:00   Coffee + posters
17:30   SIGUR AGM
18:00

20:00   Social dinner (self-paid) at Bryggeri, Sofiankatu 2, 00170 Helsinki

Day 2 Programme

9:00-13:00 Grammar checkers (Trond and Sjur)

Universal Dependencies (Fran)
Rule-based machine translation: FST, CG, … (Tommi)


*) The following posters will be presented in the poster session:

– Development of an Open Source Natural Language Generation Tool for Finnish (Mika Hämäläinen and Jack Rueter)
– Guessing lexicon entries using finite-state methods (Kimmo Koskenniemi)
– Dependency Parsing of Code-Switching Data with Cross-Lingual Feature Representations (Niko Partanen, Kyungtae Lim, Michael Rießler and Thierry Poibeau)
– Sound-aligned corpus of Udmurt dialectal texts (Timofey Arkhangelskiy and Ekaterina Georgieva)
– A Finnish News Corpus for Named Entity Recognition ()
– Building a Finnish SOM-based ontology concept tagger and harvester (Seppo Nyrkkö)

Poster authors should prepare a flash-talk (poster boaster) of no
longer than 3 minutes.


Unioninkatu 40 (Metsätalo)
Helsingin yliopisto
Helsinki, Finland
Lecture hall (sali) 12, 3rd floor


To register for the workshop, please fill out the registration form. NB: there is an optional 50 euro participation fee, which will be used to cover running costs.

(Use the above link for registration)

Invited speaker

Veronika Laippala
“How we built the Finnish dependency treebank and all the many things we did with it”

Call for papers

The purpose of the conference series International Workshop on Computational Linguistics for Uralic Languages is to bring together researchers working on computational approaches to these languages. We accept long and short papers as well as tutorial proposals dealing with the following languages: Finnish, Hungarian, Estonian, Võro, the Sámi languages, Komi (Zyrian, Permyak), Mordvin (Erzya, Moksha), Mari (Hill, Meadow), Udmurt, Nenets (Tundra, Forest), Enets, Nganasan, Selkup, Mansi, Khanty, Veps, Karelian (Olonets), Karelian, Ingrian (Izhorian), Votic, Livonian, Ludic, and other related languages.

All Uralic languages exhibit rich morphological structure, which makes processing them challenging for state-of-the-art computational linguistic approaches. The majority also suffer from a lack of resources, and many are endangered.

Research papers should present original, substantial and unpublished research, and may describe work-in-progress systems, frameworks, standards and evaluation schemes. Demos and tutorials will present systems and standards that work towards interoperability and unification of different projects, applications and research groups. Appropriate topics include (but are not limited to):

  • Parsers, analysers and processing pipelines of Uralic languages
  • Lexical databases, electronic dictionaries
  • Finished end-user applications aimed at Uralic languages, such as spelling or grammar checkers, machine translation or speech processing
  • Evaluation methods and gold standards, tagged corpora, treebanks
  • Reports on language-independent or unsupervised methods as applied to Uralic languages
  • Surveys and review articles on subjects related to computational linguistics for one or more Uralic languages
  • Any work that aims at combining efforts and reducing duplication of work
  • How to elicit activity from the language community, agitation campaigns, games with a purpose

To maximise the possibility of reproducibility, replication and reuse, we particularly encourage submissions which present free/open-source language resources and make use of free/open-source software.

One of the aims of this gathering is to avoid unnecessary duplication of work in the field of Uralistics by establishing connections and interoperability standards between researchers and research groups working at different sites. We have also identified a serious lack of gold standards and evaluation metrics for all Uralic languages, including those with national support; any work towards better resources in these fields will be greatly appreciated. In this year’s edition, we continue our tradition of particularly encouraging researchers of minority Uralic languages in Russia to participate.

Important dates

  • 3rd July 2017: Call for papers announced
  • 1st October 2017: 2nd call for papers
  • 19th November 2017 (originally 14th): Paper submission deadline
  • 6th December 2017: Paper notification
  • 23rd December 2017: Camera-ready deadline
  • ?? January 2018: Fill in the registration form
  • 8th–9th January 2018: Workshop held in Helsinki

Submission of papers

Language of submission: Submissions should be made in English or Russian, with an obligatory abstract in at least one Uralic language.

Submission format: There are multiple submission types: long and short research papers, and demonstrations and tutorials. Research papers should be up to 18 pages in length excluding references; descriptions for demonstrations and tutorials should be up to 5 pages. Submissions should be formatted using the default LaTeX article class with the b5paper option. Citations should be managed with BibTeX and a bibliography style such as unsrt. Linguistic glosses should follow the Leipzig glossing rules and use the expex LaTeX package (make sure to update expex regularly, as it is actively developed). The preferred LaTeX engine is XeLaTeX, so you should use UTF-8 encoded Unicode in your sources rather than TeX-encoded characters where possible. You will find the workshop template here (also in zip format templates).


If you do not have access to LaTeX text processing system, please contact us for alternative templates and instructions.
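
The formatting requirements above can be illustrated with a minimal sketch. This is not the official workshop template, which remains authoritative; the fontspec package, the file name references.bib, the placeholder title and author, and the Finnish gloss example are assumptions made purely for illustration.

```latex
% Minimal sketch only; NOT the official workshop template.
% Intended to be compiled with XeLaTeX (and BibTeX for the references).
\documentclass[b5paper]{article}
\usepackage{fontspec} % assumption: fontspec for Unicode font handling under XeLaTeX
\usepackage{expex}    % interlinear glosses

\title{Placeholder title}
\author{Placeholder author}

\begin{document}
\maketitle

% UTF-8 characters are typed directly rather than as TeX accent macros.
Sámi, Võro, Hämäläinen.

% An interlinear gloss in Leipzig style, set with expex.
\ex
\begingl
\gla talo-ssa //
\glb house-\textsc{ine} //
\glft `in the house' //
\endgl
\xe

\bibliographystyle{unsrt}
\bibliography{references} % hypothetical file references.bib
\end{document}
```

Since the reviewing process is double-blind, the author field in a submitted version would presumably be left anonymous; check the official template for the exact convention.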

Submissions can be made here using the EasyChair conference management system.

Publication venue: Proceedings of the workshop will be published open-access in the ACL Anthology, in the SIG proceedings for SIGUR.

Conflicts of interest: The reviewing process will be anonymous (double-blind peer review), and authors should state in their submission all conflicts of interest with members of the programme committee. Members of the programme committee are also expected to state their conflicts of interest during review bidding. If the programme committee finds itself unable to review some of the submissions, external reviewers may be called in.

Double submission: To maximise the impact of work in the field of computational linguistics for the Uralic languages, we are open to the possibility of double submission, or submission of work which has been partially published elsewhere. Any double submission should, however, be reported to the programme committee at the time of submission. In the event of double acceptance, the authors should choose in which venue to publish.


Participants from outside the Schengen area may require a visa to visit Finland. If you require an invitation letter confirming your participation, please get in contact with the local organising committee.

A small number of travel stipends will be available for authors of accepted papers. After submitting your paper please contact the organising committee to request consideration.


Hotels may be found through your preferred online hotel price comparison site. If you have any questions about accommodation, please feel free to contact the organisers.

Suggested hotels:


If you need a visa, please contact the local organising committee.


  • Tommi A. Pirinen, Universität Hamburg
  • Michael Rießler, Albert-Ludwigs-Universität Freiburg
  • Trond Trosterud, UiT Norgga árktalaš universitehta
  • Francis M. Tyers, UiT Norgga árktalaš universitehta / Высшая школа экономики

Local organising committee

  • Jack Rueter, Helsingin yliopisto
  • Jörg Tiedemann, Helsingin yliopisto
  • Krister Lindén, Helsingin yliopisto

Programme committee



Any questions should be directed to the organising committee on