Finno-Ugrian Researcher Discovers Linguistic Treasures Every Day

We recently published the first material produced in the continued Digitisation Project of Kindred Languages in the Fenno-Ugrica collection, a total of 75 monographs in the Mari languages. To discuss this material, we met with Finno-Ugrian researcher Mrs Julia Kuprina, a project researcher at the Morphological Analyzers for Minority Finno-Ugrian Languages project. We spoke with her about the material in the collections, her own research in language technologies, and naturally also the Hill Mari language.

Could you tell us a little about your research background and your current work?

I began my work with Hill Mari, my native language, in Finno-Ugrian studies at the University of Tartu, where I researched verb forms (Hill Mari has five of them) and their use for my seminar essay, which later became my Master’s thesis. After graduating I moved to Finland in 2001, but I could not find work in my field, so I trained in marketing and sales. That has been my occupation for approximately a decade. At the same time, however, I have continued researching Hill Mari in my spare time. During maternity and parental leave I translated children’s books and other texts into my native language.

When I was offered the opportunity to work on my native language just over a year ago, I was delighted to join the research group. Our project manager, Jack Rueter, encouraged me to overcome my initial hesitation towards language technology, and my work started to go well. I am currently involved in glossing (processing lists of lexemes, e.g., translating Hill Mari words into Finnish) and the definition of continuation lexicons. More information, including on the work on Hill Mari, can be found on the Giellatekno website maintained by the University of Tromsø.

What questions does research in language technology answer; what is the aim of your project?

Our project is based on open-source language technology and caters to the needs of the general public in our digitised world. We are working on morphological analysers that can be reused as sophisticated online dictionaries of the languages involved in the project, to help both linguists and language enthusiasts. At the moment, the dictionary of Hill Mari includes approximately 22,000 lexemes. The dictionary also features a morphological analyser, which allows users to search for a word in any of a wide range of its inflected forms. Hill Mari school pupils and learners of the language will also surely benefit from the bilingual online dictionary, which includes sample phrases and examples of how the words are used. Language professionals such as teachers and journalists are also served by a derivative of our analyser in the Voikko proofreading application for Hill Mari. Our current work on the analyser is immediately visible in improvements to the beta version of the Hill Mari spellchecker, which is available for testing.

The project has also been able to utilise development at Giellatekno in a browser-based application that makes it possible for anyone fluent in Finnish to read old Finno-Ugrian newspapers or books just by Alt-double-clicking on a word in the PDFs. The National Library has now provided a great deal of material in the kindred languages of Finland for open use in the Fenno-Ugrica collection. The online application, a bookmarklet, is already helping users read the material.

The materials in Fenno-Ugrica for both the pilot and continuation stage of the Digitisation Project for Kindred Languages were selected in cooperation with researchers. The intention has been to engage researchers in the cooperation already in the planning stage of the project, since they have the most comprehensive understanding of the accessibility of relevant materials as well as their research impact. Last August, the Digitisation Project for Kindred Languages arranged a researcher meeting which led to the drafting of a grant application to continue the project. The application was submitted to the funder at the end of September.

Several criteria, defined in cooperation between the National Library, its partner libraries and researchers, were employed in the selection of the materials. The key criterion was the creation and establishment of the contemporary written language. Material from the time when the written language was being established is also important for activists seeking to preserve the language today. Neologisms from the 1920s and 1930s as well as texts that use them serve as both source material and a source of innovation and inspiration for the developers of the contemporary languages. The works were proposed for digitisation so that they would not only represent the innovative 1920s accurately, but also reflect the changes in language policy which occurred in the 1930s.

Mrs Kuprina, you participated in proposing material for digitisation, and many of the works you proposed have now been published in Fenno-Ugrica. What type of language material is this?

At the time I wanted to propose more books in my native language for digitisation, but I’m really grateful to you and the Kone Foundation that we researchers now have as much Hill Mari material at our disposal as we do. It’s a wonderful collection of rare material from the 1920s and 1930s! The Hill Mari language has not been widely researched, so there’s a lot of ground to cover.

Personally, I’m interested in the development of the Hill Mari orthography. Its development can be roughly divided into four stages, which emphasises the contrast between the way words in minority languages were written at the beginning of the last century and contemporary spellings. When selecting the material, I based my proposals on books written or edited by Mari people to ensure natural linguistic expression, as free from outside influences as possible. Russia began developing its minority languages in the 1920s, and the books reflect the consequent excitement of discovering and creating words and phrases to describe new concepts.

You mentioned you discover linguistic treasures every day?

Every time I open a new newspaper or monograph, I make fascinating discoveries. My latest discovery came when I read the first official Hill Mari orthography from 1940. I had been waiting to read the book for a long time, and was surprised at how similar it was to later orthographies. The end of the 1930s saw a total departure from the previous spellings, which were more Mari-based. The efforts to Russify the Mari language are already evident in the 1940 orthography. The word “finger” went from парньа to парня, “to study” from тымэньӓш to тыменяш, and “pants” from йалаш to ялаш, etc.

What significance does the Fenno-Ugrica collection hold for your work? How do you use the collection in your research?

The collection has tremendous significance for my work. I am currently in the process of adding my lexical discoveries to our online dictionary in hopes that they will be used in the future. I’m hoping to find specialist terminology from old textbooks. For example, it would be interesting from the perspective of linguistic history to study the anatomical vocabulary of Hill Mari. We currently have no Hill Mari word for the lens of the eye – it is referred to with the Russian word. This is of course the result of the 1938 decision that banned minority peoples from using terminology in their own languages in school textbooks.

It is important for our online dictionary to discover more (native) words which cannot be found in printed dictionaries. This will also improve the example sentences and phrases. The unique Fenno-Ugrica collection will hold even greater significance for future research. It is also likely that school children will find the textbooks of their great-grandparents interesting along with the other digitised material, since they offer an avenue for analysing their own identity and gaining more information on the history of the Hill Mari region. As a reader, I found the history of my native region springing to life as in a documentary film when I read old Hill Mari newspapers.

Julia Kuprina was interviewed by Jussi-Pekka Hakkarainen

***

Additional Information Regarding the Project Open-source Language Technology for Uralic Minority Languages:

The project is based on open-source language technology and tries to cater to the needs of the general public in our digitised world. We are working on a sophisticated online dictionary of the languages involved in the project, which aims to help linguists, language enthusiasts and users of the language in general.

In the course of our work, we were surprised to learn how much wider the range of applications for morphological analysers is than we had expected. The plan for 2013-2014 was to create a morphological transducer based on the Giellatekno user interface for the languages covered by our project (Olonets Karelian, Livonian, Tundra Nenets, Moksha, Hill Mari), using Finnish open-source technology, following in the footsteps of work carried out for Finnish and Northern Saami. Every language’s transducer has a lexicon, on the basis of which we create a multi-faceted derivation system.

At the moment, the dictionary of Hill Mari includes approximately 22,000 lexemes and serves as a free web dictionary, in combination with the analyser and the glossing. It can be utilised in reading Hill Mari websites, including Wikipedia, blog articles and others. The morphological analyser is a good tool for linguists aiming to verify their hypotheses and for language enthusiasts aiming to improve their language skills. Earlier this year we created a version of the Finnish-Hill Mari dictionary including declensional information for the Finno-Ugric Winter School in Szeged, where a Hill Mari course was taught.

Local school children and other learners of the language will also surely benefit from the bilingual online dictionary, which includes sample phrases and examples of how the words are used. Language professionals such as teachers and journalists are also served by the Voikko proofreading application in the languages of the project. The project is currently working on the application and has released a beta version for Linux, macOS and Windows.

By the end of the year, we will also publish a pedagogical demo, taking solutions created for other languages into consideration.

The project has also worked on a browser-based application which would enable any Finn to read old Finno-Ugrian newspapers or books just by clicking on a source word. The National Library has now provided a great deal of material in the kindred languages of Finland for open use in the Fenno-Ugrica collection. The online application will help users read the material.

Additional information was provided by the project lead, Jack Rueter

Fenno-Ugrica

The Blog of the Minority Languages Project – National Library of Finland
