The 7th World Congress of the Finno-Ugric Peoples took place last week in Lahti. The World Congress brings the Finno-Ugric and Samoyed peoples together at a joint forum to discuss the preservation and revival of their languages and cultures, as well as the rights of indigenous and minority peoples.
The main theme of the Congress was “The Finno-Ugric peoples – towards sustainable development”, and this theme was split into five thematic sessions. The second session, “Finno-Ugric information environment: future development prospects”, focused on 1) experiences gained from the operation of national cultural centers and 2) native languages in the context of advanced information technologies.
Naturally, the second session was the main focus for our project too, since we have worked in this field for the past 4.5 years and released close to 30 000 digitized items with their datasets in 18 Uralic languages online. It can be argued that the National Library of Finland has been acting as a data provider for those who want to develop language tools for small Uralic languages.
The session was opened by Marina Fedina of The Finno-Ugric Laboratory for Support of the Electronic Representation of Regional Languages (FU-Lab, Syktyvkar, Komi Republic), who gave a historical overview of the progress made on the main theme since 1996. According to the statistics Fedina presented, only a little progress has been made, language by language, in twenty years, and not all the minority languages in Russia have the language resources and tools needed to make the Finno-Ugric languages into languages of the information environment.
The lack of spell-checkers, purpose-built corpora and native character sets, for instance, is a great hindrance to using some of the languages as tools of communication on the Internet. The preliminary results of the work done on Finno-Ugric Languages and the Internet confirm this hypothesis; some of the languages are strong and vivid enough to be used in this environment, whereas others hardly exist on the Internet at all.
The core problem has been that the work has been conducted locally, language by language, and since there hasn’t been enough good practice and expertise to run the tasks, the work has remained undone in some languages. The decisions made at earlier congresses have not been put into practice and the majority of the work has remained undone, even though legislation in the Russian Federation supports the tasks.
Good times, however, are rolling. Fedina told us that there has been an initiative to appoint the FU-Lab as a federal-level aggregator to develop the language technology resources and tools for Finno-Ugric languages in Russia. This would mean that FU-Lab would receive federal money either to establish a new organization or develop FU-Lab to take over the responsibilities regarding Finno-Ugric languages in Russia.
In my opinion, this initiative is more than welcome and I fully support it. Marina Fedina and her FU-Lab have all that it takes to make these things happen, if only they get the needed resources from the State. My only concern is related to those languages which haven’t received an official state language status in Russia – will these languages be endowed with the same tools and resources?
Jack Rueter (University of Helsinki) also took part in the thematic session, and he believes that FU-Lab could play a key role in setting the pace for IT development in the minority languages of the Russian Federation.
Jack says, “FU-Lab has developed literary corpora for the Komi Zyrian and Permyak languages (over 20 million words), which is work that can be modeled in other languages. Future work ought to include parallel corpora for an improved understanding of interlingual relations.
FU-Lab has taken the initiative and developed keyboards for writing various languages. This, of course, is not always enough. Future development will require language-specific description, so that individual languages will automatically be recognized according to the input keyboard, and there is also a need for further Unicode implementation in both modern and historical language documentation.
FU-Lab has developed much of its infrastructure interacting with open-source specialists both domestically and abroad. This is a key to maintainable language development. By bringing Komi into the open-source IT media, FU-Lab has also gotten to a point where it can begin the training of native specialists both in further IT development and translation. Ongoing development in indigenous language on a local level will offer a genuine model for others to follow.”
Summa summarum: compared with the other, more abstract resolutions of the Congress, this initiative sounds practical enough to be realized, and I personally believe that FU-Lab is well qualified to run this project and spread the knowledge amongst the Finno-Ugric peoples.
Minority Languages Project
National Library of Finland