This week Helsinki hosted the LIBER 2016 conference; LIBER is the main network of research libraries in Europe. This year it was held at Paasitorni, organized jointly by the National Library of Finland, Helsinki University Library and the Finnish research libraries. There was a record number of attendees and submitted proposals, and the attendees came from a wide variety of countries. The theme of the conference was “Libraries Opening Paths to Knowledge”, which allowed for a wide range of talks while still keeping the presentations linked to each other.
Digital humanities, eh?
Digital humanities was visible at LIBER too, as a recognized field that requires libraries everywhere to adjust their services towards those who want to work with materials via algorithms in addition to close reading. The topic also came up in the discussions after sessions and in side conversations during the breaks. On Thursday in particular there were sessions about service design, describing conscious efforts to respond to the new needs of library clients via service design, service modelling and net promoter score metrics, instruments more familiar from the private sector. But this shows that libraries are opening new paths to information, as the conference tagline goes, in whatever ways are available. Even small things, like having people on hand to guide students, can have a huge impact on the daily highs and lows of an individual researcher.
Plenary: Digital Humanities in and of the Library
The plenary session by Peter Leonard of the Yale University Library Digital Humanities Lab was particularly interesting, as he covered both in theory and in practice how a library can support digital humanities. He went through the collections, but also the research projects that utilize the data.
One of the aspects discussed was special collections and DH, where user engagement (which some might call crowdsourcing) is a way both to enrich the data and to increase awareness of the materials. The first example was mainly about the fascinating Transcribe project, where volunteers who know the Cherokee language transcribe materials written in a unique font set. As mentioned on Twitter, this approach of targeted crowdsourcing is similar to the well-known Fenno-Ugrica project.
— Hakkarainen (@Hakkarainen) June 30, 2016
The second case of enrichment was done via a kind of crowdsourced tagging of the original materials. In the digitized materials, for example books, volunteers draw bounding boxes around significant terms, e.g. characters, writers, or titles. One of the demos then showed a network analysis of works, artists and their content.
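To give a feel for the kind of network analysis such tags make possible, here is a minimal sketch with hypothetical (artist, work) pairs standing in for the crowdsourced bounding-box tags; networkx is my choice of tool here, not necessarily what the demo used:

```python
import networkx as nx

# Hypothetical crowdsourced tags: (artist, work) pairs derived from bounding boxes.
tags = [
    ("Artist A", "Work 1"),
    ("Artist A", "Work 2"),
    ("Artist B", "Work 2"),
    ("Artist B", "Work 3"),
]

# Build a bipartite graph linking artists to the works they are tagged in.
G = nx.Graph()
G.add_edges_from(tags)

# Works connected to more than one artist hint at collaboration or shared content.
shared = [n for n, d in G.degree() if n.startswith("Work") and d > 1]
print(shared)  # ['Work 2']
```

From a graph like this one can go on to compute centrality or community structure, which is roughly what the demo visualized.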
Then, as a different kind of example, the final case was about Vogue magazine, where the content itself was restricted, but the lab had found ways to still utilize the materials in image-based analysis. For example, we saw how the front page of the magazine changed over the years and how the hue and brightness vary. From the Vogue materials there were also “frequency counts” of terms, visualized on a timeline. Via topic modelling it was then also possible to let the content speak to the researchers: it created groups of terms that tend to appear close to each other, and thus thematic groupings which researchers can dig into more deeply. Access controls were utilized as well, since the system provides the full text of articles to those who have access to the restricted content, while others work at the aggregate level. This is a bit similar to the goals of the NLF’s Aviisi project, where access control has been a key development target in order to increase the availability of more recent materials.
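The “frequency counts of terms on a timeline” part is conceptually simple; here is a minimal stdlib-only sketch, where the toy (year, text) pairs are my own stand-in for the actual Vogue article text:

```python
from collections import Counter
import re

# Toy stand-in data: (year, text) pairs for magazine issues.
issues = [
    (1950, "hat dress silk dress coat"),
    (1960, "dress model photo dress"),
    (1970, "model photo jeans jeans"),
]

def term_frequency_by_year(issues, term):
    """Count occurrences of `term` per year: the raw data behind a timeline plot."""
    counts = {}
    for year, text in issues:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts[year] = Counter(tokens)[term]
    return counts

print(term_frequency_by_year(issues, "dress"))  # {1950: 2, 1960: 2, 1970: 0}
```

On aggregate data like this, researchers without full-text access can still see when a term rises and falls, which is exactly the access-control trade-off described above.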
Library people understood that “writing software” is one of the key skills a researcher might need. In the discussion after Yale’s plenary session there were actually considerations of how goal-oriented learning those skills should be. The comparison given was to language courses: they might be taken at some point, and even if the learner never gains passable language skills for everyday discussion, the learning can still be seen as valuable, since knowledge of culture and other things is gained along the way.
Especially in programming, software stacks, languages and tools rise and fall quite rapidly, and one of the core skills could simply be juggling many possible programming languages in order to select the most suitable tool for a job. As Prof. Tolonen and Dr Leo Lahti argued at LIBER 2015, there should be reproducible workflows from data, to code, to results.
For example, the open science portal recently added text reminding that citing software and computer programs should be done, which is discussed in more depth in, for example, the FORCE11 software citation principles. Personally, I have always liked it when a GitHub repository contains citation instructions directly, because then it is relatively convenient to just copy the given format. For example, Mike Jackson lists ways to get a citation from various tools, e.g. from R modules via citation('modulename'), and his final recommendation, adding a dedicated command-line option for citation, is actually a very implementable suggestion. In a way, generating a URN for a blog post would also be possible, but would it be useful? Especially if using GitHub for blogging, it could be possible to get a URN for a post automatically; it would happen behind the scenes but still be available.
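As a sketch of how implementable that recommendation is, a command-line tool could expose its citation roughly like this (the tool name and BibTeX entry are invented for illustration):

```python
import argparse

# Citation text shipped with the tool; this BibTeX entry is purely hypothetical.
CITATION = """@misc{exampletool2016,
  title  = {ExampleTool: a hypothetical analysis tool},
  author = {Jane Doe},
  year   = {2016},
}"""

parser = argparse.ArgumentParser(prog="exampletool")
parser.add_argument("--citation", action="store_true",
                    help="print how to cite this software and exit")

# Simulate running `exampletool --citation` on the command line.
args = parser.parse_args(["--citation"])
if args.citation:
    print(CITATION)
```

A user can then pipe the output straight into their reference manager, which is the convenience the in-repository citation instructions aim for.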
On Friday there were talks about DH training, where e.g. Maija Paavolainen from the University of Helsinki told about the current digital humanities course at Digital Humanities Helsinki and what students, content providers and trainers had learnt along the way. The National Library of the Netherlands also talked about the library’s role in DH: it all starts with (digitized) data, but there was also a good reminder to be critical towards all tools, and towards digital sources in general.
— LIBEReurope (@LIBEReurope) June 30, 2016