Notes on Digital humanities hackathon 2016

You know it is nearly summer when the digital humanities hackathon appears on the calendar. This year it was organized on 16–20 May 2016 jointly by Aalto University and the University of Helsinki. The hackathon is the culmination of the digital humanities studies: project work that brings together students from various fields to work out a research question and possible answers using the available materials.


Futurehack Hackathon at MAMK

On 23–24 April, MAMK organized a hackathon that continued the data seminar on combining AI and well-being held on the 22nd.

 


The challenges were

1. Medical imaging

2. Personal health and well-being

3. Creating value with professional medical data

4. Own track
At this hackathon, every team had a project that addressed a real problem. The winning teams focused most on developing tools for the challenges and had working prototypes with nice graphical designs.

 

1st team:

AI.MD (Artificial Intelligence for Medical Data)

 

Developed a tool that teaches IBM Watson to recognize x-ray images (e.g. for knee detection) by using gamification.

 

2nd team:

Coding Bad

Developed a health-assistant bot for Telegram that you could communicate with by text or speech. You could also send the bot an image, and it used Watson to detect your age, gender, etc.

 

3rd team:

Team Schrödinger

Developed a way to detect whether the quality of an x-ray image is good or bad.

 

Another interesting team, Fear Catchers, detected the user's emotions from multiple sources at the same time by using many of the IBM APIs (face recognition, Tone Analyzer, etc.) and then used that information to calm the person down (for example during a presentation). Many teams also tried to solve problems related to obesity and diabetes (food diary, food imaging and a mobile-controlled wireless insulin injector). There was also a working prototype of a kettlebell that told you (using a speech synthesizer) how to use the correct technique during a workout.

 

 

IBM Bluemix API use example:

I tried one of the APIs to recognize images. There are simple curl commands you can use to access the data (with your own API key):

curl "https://gateway-a.watsonplatform.net/calls/url/URLGetRankedImageKeywords?url=http://tethys.lib.helsinki.fi/pienpainate/binding/342795/thumbnail/1&outputMode=json&apikey={MyAPIkey}"

gave answer:

{
    "status": "OK",
    "usage": "By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html",
    "url": "http://tethys.lib.helsinki.fi/pienpainate/binding/342795/thumbnail/1",
    "totalTransactions": "4",
    "imageKeywords": [
        {
            "text": "person",
            "score": "0.960834"
        }
    ]
}
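
The same request and response can also be handled from a script. The following is a minimal Python sketch, not IBM's client library: it only builds a query URL with the same parameters as the curl command and parses a response shaped like the one shown. The helper names build_request_url and confident_keywords and the 0.5 score threshold are assumptions made for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Endpoint copied from the curl command in this post.
BASE_URL = "https://gateway-a.watsonplatform.net/calls/url/URLGetRankedImageKeywords"


def build_request_url(image_url, api_key):
    """Build the query URL with the same parameters as the curl example."""
    query = urlencode({"url": image_url, "outputMode": "json", "apikey": api_key})
    return f"{BASE_URL}?{query}"


def confident_keywords(response_text, min_score=0.5):
    """Return (keyword, score) pairs whose score is at least min_score.

    The API returns scores as strings, so they are converted to float here.
    The default threshold of 0.5 is an arbitrary choice for illustration.
    """
    data = json.loads(response_text)
    if data.get("status") != "OK":
        raise ValueError(f"unexpected status: {data.get('status')!r}")
    return [
        (kw["text"], float(kw["score"]))
        for kw in data.get("imageKeywords", [])
        if float(kw["score"]) >= min_score
    ]


# Response body shaped like the answer shown above (usage notice omitted).
SAMPLE_RESPONSE = """
{
    "status": "OK",
    "url": "http://tethys.lib.helsinki.fi/pienpainate/binding/342795/thumbnail/1",
    "totalTransactions": "4",
    "imageKeywords": [{"text": "person", "score": "0.960834"}]
}
"""

print(confident_keywords(SAMPLE_RESPONSE))
```

Because urlencode percent-encodes the image address, the generated URL differs slightly from the hand-written curl one, but it carries the same three parameters.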

If this caught your interest, try the IBM APIs by registering for a free trial.

Data seminar at MAMK

On 22 April, MAMK organized a data seminar about how data, AI and technology can increase well-being.

Keynote about emerging Watson technologies

The keynote by IBM’s Vince A. Daukas went through the problem of current data volumes, the basics of the solution, and how IBM’s Watson and its ecosystem of services can serve different purposes. One present-day fact is that the amount of data increases all the time, and it was mentioned that

2.5B gigabytes of new data are generated every day. Approximately 80% of data is unstructured.

IBM also sees a new era of ‘cognitive computing’ (some info also in Finnish). The idea is to simplify the processing of massive amounts of data, e.g. via machine learning, to reduce the need for programming and make interactions more human-friendly.

IBM’s Watson system was also covered extensively; it incorporates the concepts of technology, platform and solution. From the examples given, it seemed that (nearly) every kind of data can be simplified, referenced and put to better use with it. In the case of the National Library of Finland, for instance, reading all of the 10 million digitized pages would take more than 10 years, whereas Watson can capture all of the research articles, patents and whatever other content is deemed significant, and create new connections. This was opened up a bit further in the Q&A: the data stays on the content server, but Watson ingests data all the time to capture new items, and rankings and priorities can be taught if needed.

Watson capabilities (2016)

In a way, AI technologies also face very familiar concerns. It is important to know what the technology can do, and how (and probably where) it can be applied. As it is a business, the ROI and the high-value use cases into which research or effort is put need to be evaluated. Training routines for both people and machines require thought, and the perceived risks around data privacy, cloud and AI need to be governed.

data > info > knowledge

Watson and health

Jukka Rupponen then went deeper into Watson from the angle of health and wellness. He began by startling the audience with a (sad) fact:

$7.8 trillion is expended annually for health and social programs around the world. Up to 30% of all that money is wasted.

The division of labour was also considered, as usual in automation cases: humans, computers and cognitive systems should each focus on the area where they work best.

Watson Developer Cloud

The idea is that the app developer starts with their question and can then use the many services on offer; IBM Watson Developer Cloud and Discovery Advisor were mentioned as examples. For developers, the important links for actually getting started with the services were given: https://console.ng.bluemix.net/ and http://github.com/watson-developer-cloud

Finnish examples – Disec & Kuopio

Elävä arkisto

Health information gamification was the topic of the talk from Disec, which takes care of storing medical images. Around 700,000 new images arrive annually, and the storage time is at minimum 12 years. Whereas we at the library deal with standards like METS, ALTO and PAS, the health side has HL7 and DICOM, and on top of those all the health care systems, which either manage the workflow or form the multitude of tools for doctors.

As an interesting contrast, the speaker mentioned a study from 2015 in which pigeons were trained to spot anomalies in images. Apparently, after a week they can detect minute changes such as calcifications in mammogram images quite well. This prompted some remarks, as humans, Watson and pigeons were then playfully compared to each other.

Games for health

Kuopio Innovation talked about gamification and health; the speaker saw this as a real opportunity for Finland. The idea was to put the user at the center and then, via gamification, find ways to activate, educate and overall improve quality of life.

The Q&A session was also quite lively: the audience included many subject matter experts, who were able to go into the details on the basis of the presenters' lectures. As said in the HackLab tweet:

It sounded like this hackathon is just a kick-off for longer-term work; as discussed at the end of the event, the idea was to start from one small idea, do what is possible in two days, and continue later on. One idea can act as a wedge and build up to new things. Developers will surely be interested in all the building blocks and APIs that the IBM services provide.

 

P.S. The session was also covered by a web article and local news (the TV clip link is valid until late May).

Digital humaniora i Norden, aka DHN 2016

In mid-March, the conference named in the title, ‘Digital humaniora i Norden’, took place in Oslo, Norway. For the program there were 125 proposals for presentations and posters, of which 79 presentations and 13 posters were accepted. From Finnish organizations there were 13 contributions, consisting of 1 session, 10 papers and 2 posters.

Viribus unitis

Tekniska föreningens i Finland förhandlingar, 01.01.1880, no. 1, p. 6, at http://digi.kansalliskirjasto.fi/aikakausi/binding/1131906/articles/1843678#?page=6 (National Library’s Digital Collections)

Type | Title | Organization
session | Ett komplett arbetsflöde för två digitala utgåvor | Svenska litteratursällskapet i Finland
paper | Narrative Approaches to the Digitalization of Participatory Urban Planning: Bringing Plot and Metaphor to PPGIS methods | University of Tampere, Aalto University
paper | Finländska klassikerbiblioteket – finländsk litteratur i det digitala landskapet | University of Helsinki
paper | Nordic Englishes on Twitter | University of Oulu
paper | Historians digging in the text mine. Exploring blended close & distant reading of technical journals to understand Finnish history of industrialization, 1880-1910 | Aalto University, University of Turku
paper | Counterfactual history in simulation video games: a methodological approach to studying historical consciousness | University of Helsinki
paper | Dealings with Uneven Corpus – Experiences from the Use of a Difficult Research Data | University of Helsinki
paper | Assessing lexical quality of a digitized historical Finnish newspaper collection with modern language technology tools | National Library of Finland
paper | Unlocking a Finnish Social Media – In Search of Citizen Mindscapes | University of Helsinki
paper | Travelling TexTs: A HERA financed five countries project from the point of view of its Finnish team | University of Turku
paper | Teaching Digital Humanities at the Faculty of Arts at the University of Helsinki | University of Helsinki, Aalto University
poster | Classical Intertextuality in Late Greek Poetry: a Computational Approach | University of Helsinki
poster | SEMATIA: Linguistic Annotation of Greek Papyri | University of Helsinki

 

 

 

 

 

National Library of Finland at DHN16

NLF also presented at DHN16, with a talk on “assessing lexical quality of a digitized historical Finnish newspaper collection with modern language technology tools”. For this we went through the text material in http://digi.kansalliskirjasto.fi and evaluated it with both Omorfi and FINTWOL. We tried to find the most frequently used words and check their status in order to help, in the long run, to find ways to improve the text quality. After running these relatively easy-to-use tools, we found that the quality of the Digi material varies:

that the collection has a relatively good quality part (at least 1/3, probably up to 40–50 %) and a very bad quality part (at least about 10–20%)

 

These metrics can be useful in evaluating whether experimental corrections take the whole collection in the right direction. With the improvements we are thinking of the usefulness of the material to researchers, as there are attempts to analyze the materials further in different fields.
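
This kind of measurement can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in the paper: the is_recognized callback is a hypothetical stand-in for a morphological analyser such as Omorfi or FINTWOL, replaced here by a tiny word list, and the sample text with its simulated OCR errors is invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters (digits and punctuation dropped)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def recognition_rate(text, is_recognized, top_n=None):
    """Share of word tokens that the analyser recognizes.

    is_recognized is a hypothetical callback standing in for a real
    morphological analyser. With top_n set, only the top_n most frequent
    word types are considered, mirroring the focus on the most used words.
    """
    counts = Counter(tokenize(text)).most_common(top_n)
    total = sum(n for _, n in counts)
    if total == 0:
        return 0.0
    recognized = sum(n for word, n in counts if is_recognized(word))
    return recognized / total


# Toy lexicon and invented sample: the second sentence contains two OCR-style errors.
LEXICON = {"helsinki", "on", "suomen", "pääkaupunki"}
SAMPLE = "Helsinki on Suomen pääkaupunki. Hclsinki on Suomcn pääkaupunki."

print(recognition_rate(SAMPLE, LEXICON.__contains__))  # 0.75
```

Computing the same rate before and after an experimental correction run gives a simple way to see whether the correction moved the collection in the right direction.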

 

University of Helsinki at DHN16

UH was also well represented at the conference in Oslo. The system known as Klassikkokirjasto or Klassikerbiblioteket was presented, along with how the new web system could serve many different kinds of people, from researchers to the general public. Klassikkokirjasto was built in collaboration between the National Library of Finland, the University of Helsinki and the Department of Finnish, Finno-Ugrian and Scandinavian Studies.

Anssi Kanner had also utilized the digital collections of the National Library of Finland. He talked about how to deal with an uneven corpus, the challenges that Kettunen et al. were measuring, and how he had approached some ideas for fixes.

There was also a presentation from Aalto University and the University of Turku about using technical journals to understand the Finnish history of industrialization, 1880–1910. Not all technical journals have been digitized by NLF, but there was still enough material in, e.g., the digitized Swedish-language Finnish engineering journal to let them check the feasibility of going forward and run the first experiments. Digitization of historical materials can thus be seen to have increased the materials' versatility for research, alongside use of the analog materials.

Digital humanism is also on the rise in Finland. This is reflected in the new digital humanism curriculum, which was the topic of the paper by Mikko Tolonen, Maija Paavolainen (UH) and Eetu Mäkelä (Aalto). As they say, “Open access, and open science are a core principle of all of our DH activities”. This ideology is being brought into teaching via Helsinki DH centre (HELDIG) co-operation and collaboration outside UH, and by finding ways to integrate DH research within. The upcoming Digital Humanities Hackathon 2016 is one example of this ongoing work.

More information

The full papers can be found via links in the conference program. In addition, the book of abstracts gives a good summary of all the talks in one go.

 

Routes to digital materials

At the Digitization and Conservation Centre (Digitointi- ja konservointikeskus) we sometimes get questions about the channels through which the digital materials can be used. As is generally known, the digital materials (newspapers, journals and industrial ephemera) are available in the http://digi.kansalliskirjasto.fi web service up to the year 1910.

However, through various collaboration projects, digital materials are also available via Fin-Clarin, the Aviisi project and international services, for slightly different target audiences, from researchers to citizens. In addition, the special workstations of the legal deposit libraries should not be forgotten. This text briefly goes through all of these services.

Digital materials available