Data Cleaning Day is being held for the first time at the University of Helsinki today, Thursday 23 May. The idea of the day is for researchers to review the contents of their data folders and the quality of their data management. We asked evolutionary biologist Jonna Kulmuni how she keeps her data in order.
(This article is also available in Finnish.)
Researcher and docent of evolutionary biology Jonna Kulmuni (Research portal, ORCID) studies speciation in ants, using hybridizing wood ants (Formica rufa group) as a model system. Her research group tries to understand how natural selection acts on genes and genomes. In addition to the physical samples familiar to natural scientists, the project produces a large amount of genome data and analysis results. Managing research data is therefore a fundamental part of Kulmuni’s work.
To mark Data Cleaning Day, we asked how an evolutionary biologist keeps her data in order.
How often do you clean up your data?
“Too seldom. I have never thrown any samples away, and I have had the same mentality towards data files. Now that I have been doing research for 15 years, there are starting to be a lot of files,” says Kulmuni.
“I have tried to organize files well from the beginning, but as new students and collaborators join the project, it becomes more important to keep everything in order. In January I reorganized all the files into a hierarchical structure.”
How do you clean data files?
“Mainly I just organize files. It would be interesting to get another opinion on how this could be done.”
(Tips on how to start data cleaning have been gathered on the Data Cleaning Day wiki page, which goes through five phases: sort, set in order, shine, standardize, and sustain. In addition, information and tips on data documentation can be found in the Guide for data documentation.)
Have there been benefits from cleaning up data files?
“Work is much faster because files can be found more quickly.”