Deep learning and digital materials

On 4.5.2017 the Finnish Academy held its first annual seminar, with the topic “Deep Learning & the Humanities”.

The opening words were given by Kia Lindroos, chair of the Digihum steering group, who listed the following goals for all of the funded projects:

  • Examining digitalization as a cultural and social phenomenon
  • Employing open, multiform and/or real-time data in research
  • Usability and awareness of datasets: not just the possibilities available today, but also investigating possibilities for co-operation

She also noted the good progress of Finnish projects by listing all of the projects that were funded in the international Digging into Data challenge.

Professor Roger K Moore from the University of Sheffield gave a superbly interesting keynote on the application of machine learning to speech recognition. The title of the talk was “Bridging the Gap Between Humans and Machines: Lessons from Spoken Language”. The keynote covered the past but also the way forward, as a way to connect humans and machines. Speech seemed to share some familiar traits with old newspapers full of OCR mistakes. Language (especially spoken language) is incredibly variable: context dependence and the relationship with behavior and environment all influence understanding. A human being can learn as they go, but configuring machines to do the same requires a great deal of research and work. Interestingly, one solution seemed to be incremental speech recognition, which goes beyond stimulus-response cycles to a more evolutionary model where the speech is recognized piece by piece. Machines then have a better chance of catching the spoken word, or can even ask for clarification if something was missed. Some of the material referenced in the keynote can be found from this list.

Digitalia, newspapers, Comhis


The seminar also included presentations from all the projects that are part of implementing the Digihum program. For example, Dr. Kimmo Kettunen presented Comhis (& Digitalia) and the work done with historical newspapers. He highlighted, among other things, the importance of extracting the articles and going beyond page-level content, which is being researched now.

There were also presentations from all of the projects that received funding in the Digging into Data challenge, which were listed earlier. This challenge can bring together collaborators from universities on both sides of the pond and also give a good boost to the Finnish projects, which were well represented all in all.

All in all, the digital humanities field seemed to have lots of interesting activities ongoing in different universities. Hopefully sessions like this bring the work closer together, as information is shared from one group to another. One thing to keep in mind is that visualizations also require a deep understanding of the underlying data: recent research has shown how the visualization of a dataset can change considerably even when its statistical properties stay the same.
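
The point about visualizations and statistical properties can be made concrete with a small sketch. It uses Anscombe's quartet (1973), a classic set of four small datasets. This is an assumption chosen for illustration and not necessarily the research referred to above. The names `QUARTET`, `pearson` and `summarize` are made up for this example. The four datasets look completely different when plotted, yet their summary statistics are nearly identical (mean of y about 7.5, correlation about 0.82).

```python
from statistics import mean, variance

# Anscombe's quartet (1973): illustrative data, not taken from the seminar.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics that hide the shape of the data."""
    return {
        "mean_y": mean(ys),
        "var_y": variance(ys),  # sample variance
        "corr": pearson(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, s in SUMMARIES.items():
        print(f"{name}: mean_y={s['mean_y']:.2f} "
              f"var_y={s['var_y']:.2f} corr={s['corr']:.2f}")
```

Plotting each dataset as a scatter plot (for instance with a plotting library of your choice) would show a noisy line, a curve, a line with one outlier, and a vertical cluster with one outlier, which is why the numbers alone are not enough.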