Brief technical overview of Revizor, the editor for correcting OCR text material

Posted on 30.4.2015 by Jussi-Pekka Hakkarainen

OCR’ed text is often of poor quality because of errors introduced during optical recognition, especially when the source material is old or otherwise in bad condition. These errors make it hard to rely on the text for building a corpus or word lists, and they make the source material less accessible for study or for incorporation into other tools for language researchers. This is the problem that our OCR editor tries to eradicate, or at least to contribute a possible solution to.

Erzya Language Day, April 16th

Posted on 16.4.2015 by Jussi-Pekka Hakkarainen

Today we are celebrating Erzya Language Day. Erzya is one of the Uralic languages, and it is spoken in the Republic of Mordovia in Russia.

During the Digitization Project of Kindred Languages, we have paid special attention to materials published in the Mordvinic languages: Erzya, Moksha and Shoksha. Erzya was converted into a medium of popular education, enlightenment and dissemination of information pertinent to the developing political agenda of the Soviet state.

The Electronic Collection of the Murmansk State Regional Universal Scientific Library in Skolt Sami is linked to Uralica

Posted on 10.4.2015 by Victoria Kurkina

The collection of children's books and schoolbooks in the Skolt Sami language, links to which were provided to the project by the Murmansk State Regional Library, is available at Uralica.

Fenno-Ugrica

The Blog of the Minority Languages Project – National Library of Finland

Monthly Archives: April 2015
