Open science in my PhD

”Science should be transparent and accessible to everyone. Today, I firmly believe that science goes hand in hand with openness. When I started my PhD a couple of years ago, I did not even know what open science meant. Since then, I have taken baby steps towards a more sustainable science culture.” In this blog post, University of Helsinki doctoral student Julia Kemppinen describes, on a practical level, how she learned to understand the importance of open science and how she implements open science practices in her research.

Text: Julia Kemppinen (TUHAT, ORCID, @juliakemppinen)

I became familiar with open science when I took my very first steps in science. I was investigating the spatial and temporal patterns of soil moisture [1]. Our BioGeoClimate Modelling Lab had a huge amount of field measurements, but our spatial predictions required detailed information on the terrain. Luckily, a few years earlier the National Land Survey of Finland (NLS) had begun a highly ambitious and expensive project of mapping all of Finland using airborne laser scanning. The outcome was a high-resolution digital terrain map, which they would distribute online for anyone to use.

Collecting extensive soil moisture data is time-consuming and a lot of hard work. That is all the motivation I need to share my data openly. I want the data to reach its full potential! / Photo: Pekka Niittynen.

The openly available data was my first encounter with open science. However, I was not yet convinced. Why should I make time and put effort into sharing my science openly? I was busy with my newly begun PhD and I did not think much about open science, or how it related to my responsibilities to science, the scientific community or the general public. I was just a doctoral candidate figuring out her way through everyday problems with data, software and literature.

In the years to follow, I realised that the records I used for my research, the programs I used for coding and the articles I used as references were all part of the open science culture. I was part of this, whether I actively promoted it or not. I had benefitted from this culture of openness every step of the way. Someone somewhere had made the decision to make their data, software and literature openly available for me to use and to build new knowledge on. I was struck by awe. Finally, I had understood the value of open science.

In my second article, I took another step towards open science [2]. After the article was accepted, I uploaded the data, consisting of hundreds of records of plant, moss and lichen species as well as several field-measured environmental factors, to an open data repository [3]. This was due to two crucial factors. Firstly, uploading the data did not cost me anything. Secondly, sharing the data was a condition for publishing. These two factors – free of charge and force – had transformed me from an open data user into an open data contributor. Once this step had been taken, there was no turning back.

It was empowering to realise that anyone can do open science, even a PhD student juggling somewhere between collecting data and submitting manuscripts. How cool would it be to have someone using all the data and reading all the manuscripts that took oh, so many years of work! I realised that I was in control of the visibility of my data and research, and if I wanted other scientists to find my work, I should actively promote my open science.

Currently, my third article is openly available for anyone to read [4]. This is possible because I submitted it simultaneously to a journal for peer review and publishing and to a pre-print server where anyone can access it. The latter makes me very happy, because it breaks my heart that not everyone has access to scientific literature.

It is a slippery slope! One minute you are using someone else’s open data, the next minute you are uploading yours to an open data repository and looking for a pre-print server. / Photo: Julia Kemppinen.

I have received article requests from scientists at institutions without access, but also from non-scientists, for instance from the nature conservation sector. Whom do I wish to read my work, if not my fellow scientists and the non-profit organisations that work towards understanding and preserving what I value the most? Yet, I have not published in open access journals. I guess balancing between open science and career planning is one of the classic dilemmas of a modern-day scientist.

”If there were more factors in the merit system, we would not need to balance between open science and career planning.”
(Anonymous peer review comment on this blog text)

I am defending my thesis in the near future, and my next step is still one big question mark. I know that my work will be evaluated by the impact it has, not by its openness. When it came to selecting a journal, I chose the one with the highest impact and the perfect scope, even if my institute did not have an open access agreement with the journal. I wish I could have done more open science during my PhD, but it would probably have cost me more than just time and effort. I guess I could have applied for even more grants, but I chose to prioritise research equipment, field work and conferences over publication costs.

It is funny to look back. During these past three years, I have taken fairly small steps towards open science. These steps are so easy that I am planning to repeat them in all my upcoming projects:

  1. publish data openly,
  2. use plenty of high-quality open data, and
  3. upload a pre-print as soon as possible.

Maybe an additional step would be:

  4. reflect on your choices: are you living up to your values?

In my fourth and final PhD article, I have already taken the first two steps: I have contributed soil temperature data to a global database (SoilTemp), and I am using 43 000 records of plant traits from open databases (Tundra Trait Team, TRY, BIEN). The latter is crucial for my work, since I have not collected even one trait record myself. Someone somewhere had made this possible.

Literature

1. Kemppinen et al. (2018). Modelling soil moisture in a high‐latitude landscape using LiDAR and soil data. Earth Surface Processes and Landforms.
2. Kemppinen et al. (2019). Water as a resource, stress and disturbance shaping tundra vegetation. Oikos.
3. Kemppinen et al. (2019). Data from: Water as a resource, stress and disturbance shaping tundra vegetation. Dryad.
4. Kemppinen et al. (Under review). Woody plants constructing tundra soils. bioRxiv 789743.


Julia Kemppinen (TUHAT, ORCID, @juliakemppinen) is a doctoral student in geosciences at the BioGeoClimate Modelling Lab at the University of Helsinki.