The Digitization Project of Kindred Languages will be present at the Helsinki Book Fair 2015 this autumn. The Book Fair will take place from the 22nd to the 25th of October at the Expo and Convention Centre in Helsinki.
Releasing the Komi newspapers at Fenno-Ugrica
Last year, we released plenty of monographs in the Komi languages in our online collection, Fenno-Ugrica. In addition to the monographs, we are also publishing newspapers in both Komi-Permyak and Komi-Zyrian. All in all, there will be 23 titles and around 40 000 pages of Komi newspapers in our collection by the end of June 2015.
Post-production of our digital content
Anis Moubarik, an information system specialist at the National Library and a member of the DPKL team, will walk you through what happens to a digitized book in our post-production process. During the project, Anis has been in charge of creating both the OCR’ed PDFs that are available in our Fenno-Ugrica collection and the ALTO XML files per book, which are made available for editing in Revizor, the text editor for enhancing the data.
Brief technical overview of Revizor, the editor for correcting OCR text material
OCR’ed text is often lacking in quality because of errors during the optical recognition process, especially when the source material is old or otherwise in poor condition. These errors make it hard to rely on the text for building a corpus or word lists, and they make the source material less accessible for study or for incorporation into other tooling for language researchers. This is a problem that our OCR editor tries to eradicate, or at least to contribute a possible solution towards.
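As a rough illustration of how uncertain OCR output can be located for correction, here is a minimal Python sketch. It is not Revizor's code. It assumes the ALTO XML files record a per-word confidence in the standard `WC` attribute of each `String` element, which depends on the OCR engine. The function name `low_confidence_words`, the threshold value and the sample snippet are all made up for this example.

```python
import xml.etree.ElementTree as ET


def low_confidence_words(alto_xml, threshold=0.8):
    """Return (word, confidence) pairs whose WC value is below the threshold.

    Hypothetical helper for illustration; it assumes the ALTO file carries
    WC (word confidence) attributes, which not every OCR engine writes.
    """
    root = ET.fromstring(alto_xml)
    flagged = []
    for element in root.iter():
        # ALTO files are namespaced; compare only the local tag name so the
        # sketch works regardless of the ALTO schema version.
        if element.tag.rsplit("}", 1)[-1] != "String":
            continue
        confidence = element.get("WC")
        if confidence is not None and float(confidence) < threshold:
            flagged.append((element.get("CONTENT"), float(confidence)))
    return flagged


# Invented two-word sample, not taken from the Fenno-Ugrica collection.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="коми" WC="0.97"/>
    <String CONTENT="кыв" WC="0.41"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""

if __name__ == "__main__":
    for word, confidence in low_confidence_words(SAMPLE):
        print(word, confidence)
```

Words flagged this way could be shown to a human corrector first, so that editing effort goes where the recognition was least certain.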
Erzya Language Day, April 16th
Today we are celebrating the Erzya Language Day. Erzya is one of the Uralic languages, and it is spoken in the Republic of Mordovia in Russia.
During the Digitization Project of Kindred Languages, we have paid special attention to the materials published in the Mordvinic languages: Erzya, Moksha and Shoksha. Erzya was converted into a medium of popular education, enlightenment and dissemination of information pertinent to the developing political agenda of the Soviet state.
The Electronic Collection of the Murmansk State Regional Universal Scientific Library in Skolt Sami is linked to Uralica
The collection of children's books and school books in the Skolt Sami language, links to which were provided to the Project by the Murmansk State Regional Library, is available at Uralica.
Money Well Spent
The Fenno-Ugrica collection was released in June 2013 to support research in Uralic studies in the humanities. At that time, there were around 700-800 downloads on a monthly basis, which wasn’t that bad, I reckon. Actually, I was rather happy to notice that there were people using our stuff, even though some might consider that number of monthly downloads low.
The 10th of December – The Mari Language Day
The first Mari language grammar book (Sochineniya) was published in Saint Petersburg in 1775. No author is named in the book itself, but some researchers suppose that Metropolitan Veniamin Putsek-Grigorovich, who was a missionary in the Kazan region and studied the local minority peoples, took at least some part in creating the book.
The grammar book is a monument of written Mari and of Mari-language literature. In the 1770s, the Mari people were known by the Russian name Cheremis. The Mari language has two variants, Hill Mari and Meadow Mari, each of which can be divided into two further dialects, Eastern and North-Western.
Recap: HCAS Symposium: Big Data Approaches to Intellectual and Linguistic History, 1-2 December 2014, Helsinki
The past two days were spent at the premises of the Helsinki Collegium for Advanced Studies, where I participated in the Big Data Approaches to Intellectual and Linguistic History Symposium. In this blog entry, I will briefly touch on some of the topics discussed.
National Library of Russia joins Uralica
We are proud to announce that the latest, extensive set of linked materials in the Uralica portal has just been released. This time, the National Library of Russia has provided more than 2300 links to digitized monographs and newspapers in various Uralic languages.