CIFU XII, Day 2

Posted on 18.8.2015 by Jussi-Pekka Hakkarainen

What did I learn from the second day of CIFU XII? Two things at least. As a layman in linguistics, I found it interesting to see how differently language documentation can be defined. As a librarian, I was thrilled to see that people in this field take archiving seriously. These are the topics I want to cover in this blog entry too.

Continue reading →

CIFU XII, Day 1

Posted on 18.8.2015 by Jussi-Pekka Hakkarainen

So, the 12th International Congress of Finno-Ugric Studies has finally begun. Although Mr. Harri Mantila implied that the congress has become somewhat smaller than before, we are pleased to enjoy about 111 long papers and 195 presentations in 19 symposia. CIFU XII has around 380 participants from 21 countries, so I wouldn’t consider this event a small rendezvous at all.

Continue reading →

Congressus Duodecimus Internationalis Fenno-Ugristarum, 17–21.8.2015, Oulu

Posted on 16.8.2015 by Jussi-Pekka Hakkarainen

This is probably a once-in-a-lifetime experience: I am actually excited to come back to work from my summer holidays. The reason for my eagerness is the 12th International Congress for Finno-Ugric Studies, or CIFU XII, which takes place this week in Oulu, Finland.

Continue reading →

DH2015. Recap, Day 5

Posted on 3.7.2015 by Jussi-Pekka Hakkarainen

I had spent four days at DH2015 and hadn’t really chosen the sessions as the historian or the philologist in me would have wanted. No one in my organization had prompted me to attend any particular session, but when I go to conferences, I tend to attend the sessions that could provide some new information for my home organization in return. For the last day, I deliberately chose the sessions according to my own interests, and finally I was cherry-picking too.

Continue reading →

DH2015. Recap, Day 4

Posted on 2.7.2015 by Jussi-Pekka Hakkarainen

Well, well, well. The fourth day of DH2015 was packed with intense debates, which must be taken into consideration in future editions of this fine event.

Continue reading →

DH2015. Recap, Day 3

Posted on 1.7.2015 by Jussi-Pekka Hakkarainen

So, after the workshops, DH2015 finally took off. These are my observations from the first conference day.

Continue reading →

What Did I Learn from the DH2015 Workshops? Recap, Days 1-2

Posted on 30.6.2015 by Jussi-Pekka Hakkarainen

DH2015 is taking place this week in Sydney, Australia. The Digitization Project of Kindred Languages is present here, as I was given the opportunity to present a long paper on Nichesourcing of Uralic Languages later this week. Yesterday and today, I attended the pre-conference workshops. This is a brief summary of my experiences in three workshops.

Continue reading →

Digitization Project of Kindred Languages goes to Helsinki Book Fair 2015

Posted on 11.6.2015 by Jussi-Pekka Hakkarainen

The Digitization Project of Kindred Languages will be present at the Helsinki Book Fair 2015 this autumn. The Book Fair will take place between the 22nd and 25th of October at the Expo and Convention Centre in Helsinki.

Continue reading →

Post-production of our digital content

Posted on 13.5.2015 by Jussi-Pekka Hakkarainen

Anis Moubarik, an information systems specialist at the National Library and a member of the DPKL team, will introduce you to what happens to a digitized book in our post-production processes. During the project, Anis has been in charge of creating both the OCR’ed PDFs that are available in our Fenno-Ugrica collection and the ALTO XML files for each book, which are made available for editing in Revizor, the text editor for enhancing the data.

Continue reading →

Brief technical overview of Revizor, the editor for correcting OCR text material

Posted on 30.4.2015 by Jussi-Pekka Hakkarainen

OCR’ed text is often lacking in quality because of errors during the optical recognition process, especially when the source material is old or otherwise in a bad state. These errors make it hard to rely on the text for building a corpus or word lists, and they make the source material harder to use for study or to incorporate into other tools for language researchers. This is a problem that our OCR editor tries to eradicate, or at least to contribute a possible solution to.

Continue reading →

Fenno-Ugrica

The Blog of the Minority Languages Project – National Library of Finland

Author Archives: Jussi-Pekka Hakkarainen
