What did I learn from the second day of CIFU XII? Two things at least. As a layman in linguistics, I found it interesting to see how differently language documentation may be defined. As a librarian, I was thrilled to see that people in this field are taking archiving seriously. These are the topics I want to cover in this blog entry too.
Author Archives: Jussi-Pekka Hakkarainen
CIFU XII, Day 1
So, the 12th International Congress of Finno-Ugric Studies has finally begun. Although Mr. Harri Mantila implied that the congress has become somewhat smaller than before, we are pleased to enjoy about 111 long papers and 195 presentations in 19 symposia. CIFU XII has around 380 participants from 21 countries, so I wouldn’t consider this event a small rendezvous at all.
Congressus Duodecimus Internationalis Fenno-Ugristarum, 17–21.8.2015, Oulu
This is probably a once-in-a-lifetime experience: I am actually excited to come back to work from my summer holidays. My eagerness is due to the 12th International Congress for Finno-Ugric Studies, or CIFU XII, which takes place this week in Oulu, Finland.
DH2015. Recap, Day 5
I had spent four days at DH2015 and I hadn’t really chosen the sessions as the historian or philologist in me would have wanted. No one in my organization had told me to attend any particular session, but when I go to conferences I tend to attend the sessions that could provide some new information for my home organization in return. For the last day, I deliberately chose the sessions according to my own interests, so finally I was cherry-picking too.
DH2015. Recap, Day 4
Well, well, well. The fourth day of DH2015 was packed with intense debates, which must be taken into consideration in future editions of this fine event.
DH2015. Recap, Day 3
So, after the workshops, DH2015 finally took off. These are my observations from the first conference day.
What Did I Learn from the DH2015 Workshops? Recap, Days 1-2
DH2015 is taking place this week in Sydney, Australia. The Digitization Project of Kindred Languages will be present here, as I have been given the opportunity to present a long paper on Nichesourcing of Uralic Languages later this week. Yesterday and today, I attended the pre-conference workshops. This is a brief summary of my experiences in three workshops.
Digitization Project of Kindred Languages goes to the Helsinki Book Fair 2015
Digitization Project of Kindred Languages will be present at the Helsinki Book Fair 2015 this autumn. The Book Fair will take place between the 22nd and 25th of October at the Expo and Convention Centre in Helsinki.
Post-production of our digital content
Anis Moubarik, an information system specialist at the National Library and a member of the DPKL team, will walk you through what happens to a digitized book in our post-production process. During the project, Anis has been in charge of creating both the OCR’ed PDFs that are available in our Fenno-Ugrica collection and the ALTO XML files for each book, which are made available for editing in Revizor, the text editor for enhancing the data.
Brief technical overview of Revizor, the editor for correcting OCR text material
OCR’ed text is often lacking in quality because of errors during the optical recognition process, especially when the source material is old or otherwise in a bad state. These errors make it hard to rely on the text for building a corpus or word lists, and they make the source material less accessible for study or for incorporation into other tools for language researchers. This is a problem that our OCR editor tries to eradicate, or at least to contribute a possible solution towards.
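
To make the problem a little more concrete, here is a minimal sketch of how words that the OCR engine was unsure about could be picked out of an ALTO XML file for manual correction. It is not how Revizor itself works; it only illustrates the idea. The sample document, the function name `low_confidence_words` and the threshold of 0.8 are assumptions made for this example. The sketch relies on the ALTO convention that each recognized word is a `String` element with a `CONTENT` attribute and an optional `WC` (word confidence) attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real ALTO files may use another schema version and layout.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="B1">
          <TextLine ID="L1">
            <String ID="S1" CONTENT="Finno-Ugric" WC="0.97"/>
            <SP/>
            <String ID="S2" CONTENT="langnages" WC="0.41"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def low_confidence_words(alto_xml, threshold=0.8):
    """Return (id, text, confidence) for every word whose WC is below threshold.

    Words without a WC attribute are skipped, since nothing is known about them.
    """
    root = ET.fromstring(alto_xml)
    flagged = []
    for element in root.iter():
        if local_name(element.tag) != "String":
            continue
        confidence = element.get("WC")
        if confidence is None:
            continue
        if float(confidence) < threshold:
            flagged.append(
                (element.get("ID"), element.get("CONTENT"), float(confidence))
            )
    return flagged


if __name__ == "__main__":
    for word_id, text, confidence in low_confidence_words(SAMPLE_ALTO):
        print(word_id, text, confidence)
```

A list like this could serve as a starting point for a human editor, who then decides what the correct reading is. Word confidence is only a hint from the OCR engine, so a high value does not guarantee a correct word.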