Releasing the Komi newspapers at Fenno-Ugrica

Last year, we released plenty of monographs in the Komi languages in our online collection, Fenno-Ugrica. In addition to the monographs, we are also publishing newspapers in both Komi-Permyak and Komi-Zyrian. All in all, there will be 23 titles and around 40,000 pages of Komi newspapers in our collection by the end of June 2015.


Post-production of our digital content

Anis Moubarik, an information system specialist at the National Library and a member of the DPKL team, will introduce you to what happens to a digitized book in our post-production processes. During the project, Anis has been in charge of creating both the OCR’ed PDFs that are available in our Fenno-Ugrica collection and the ALTO XML files for each book, which are made available for editing in Revizor, the text editor for enhancing the data.


Brief technical overview of Revizor, the editor for correcting OCR text material

OCR’ed text is often lacking in quality because of errors during the optical recognition process, especially when the source material is old or otherwise in poor condition. These errors make it hard to rely on the text for building a corpus or word lists, and they make the source material less accessible for study or for incorporation into other tools for language researchers. This is the problem that our OCR editor tries to eradicate, or at least to contribute a possible solution to.


Erzya Language Day, April 16th

Today we are celebrating Erzya Language Day. Erzya is one of the Uralic languages, and it is spoken in the Republic of Mordovia in Russia.

During the Digitization Project of Kindred Languages, we have paid special attention to materials published in the Mordvinic languages: Erzya, Moksha and Shoksha. Erzya was converted into a medium of popular education, enlightenment and dissemination of information pertinent to the developing political agenda of the Soviet state.


Money Well Spent

The Fenno-Ugrica collection was released in June 2013 to support research in Uralic studies in the humanities. At that time, there were around 700-800 downloads a month, which wasn’t that bad, I reckon. Actually, I was rather happy to notice that there are people who are using our stuff, even though some might consider that number of monthly downloads low.


The 10th of December – The Mari Language Day

The first Mari language grammar book (Sochineniya) was published in Saint Petersburg in 1775. No author is named in the book itself, but some researchers suppose that Metropolitan Veniamin Putsek-Grigorovich, who was a missionary in the Kazan region and studied the local minority peoples, took at least some part in creating it.

The grammar book is a monument of written Mari and of Mari-language literature. In the 1770s, the Mari people were known by the Russian name Cheremis. The Mari language has two variants, Hill Mari and Meadow Mari, each of which can be divided into two further dialects, Eastern and North-Western.
