Meet Mildred 2: The data fridge for data chefs

An essential part of Project Mildred is building technical infrastructure for data services. This is carried out in Mildred’s Sub-projects 2 and 4, under the co-ordination of the IT Center. Mildred 2 is responsible for building the Data Repository Service, which will help researchers manage, share, and store research data. The service looks like a website, but researchers can also use it via their own applications and file systems.

Ville Tenhunen and Minna Harjuniemi in Viikki, Helsinki. Photo by Jussi Männistö.

The repository service is divided into two parts: the implementation of EUDAT (European Data Infrastructure) tools and the building of our own repository. The EUDAT tools are for storing and sharing data and collaborating on it during the research process, as well as for publishing and describing finished datasets. The University of Helsinki’s own repository provides tools for researchers whose data are not suitable for a cloud service like EUDAT’s.

“The EUDAT pilots began in the autumn of 2017, and the tools are to be implemented this year. The university’s own repository service is to be technically tested, and it will be ready, in some form, by the end of the year,” promises Mildred 2’s Project Manager, Ville Tenhunen.

Researchers already have access to a variety of data services outside the University of Helsinki. Project Mildred aims to exploit existing services and link them to the research process in the best possible way.

“We don’t need to step on other players’ toes here. In some disciplines, it’s clear that research data will be stored in an international, rather than a national or a local data repository. On a national level, CSC (IT Center for Science) provides long-term storage, but we are heading in a different direction. Of course, we understand that some valuable research data require long-term preservation, but we don’t want to enter too deep into the problematics of this,” says Mildred 2’s Project Owner, Minna Harjuniemi.

When we talk about data, we must not forget metadata. Metadata and data are closely integrated and cannot be separated, especially in data management, which covers the entire life cycle of research data. Project Mildred deals with metadata in both Mildred 2 and Mildred 3. However, Mildred 2 is primarily about data.

“Mildred 3 introduces data story-telling [based on metadata], which is now a red-hot topic. Mildred 2 only makes the data available, and we’ll ensure that the data for data stories are at hand when needed,” explains Ville Tenhunen.

“You can compare our work in Mildred 2 to cooking. We give you the kitchen and the ingredients. It’s someone else’s task to prepare the food according to the recipe,” says Minna Harjuniemi. Ville Tenhunen continues:

“Mildred 2 is like a fridge, into which researchers put their carrots. Then the chef comes in and does something with them. We only supply a fridge. But there are many kinds of fridge. Fridges have different degrees of coldness, for example, and we can say ‘don’t put your fish in that one, put it in this one.’”

The data fridge, with its related data services, is created specifically for the needs of researchers at the University of Helsinki. At present, research data are scattered across the most diverse places (see Data Repository Survey).

“Until now, files of a certain size have been difficult for us. For example, 1–2 terabytes of data are too large to fit into existing systems at a reasonable cost, and too small for a more extensive repository system. A lot of data fall in between. Researchers with 300–400 terabytes of data are easier to handle, because they clearly need special solutions, and they have the money and expertise. Also, a small amount of data, such as 20 gigabytes, easily fits into Wiki, or almost anywhere, in fact,” says Tenhunen.

Ville Tenhunen, Minna Harjuniemi and Jussi Männistö. Photo by Juuso Ala-Kyyny.

Even when the repository services are technically finished and released, the work is not yet complete. Project Mildred represents a new kind of service thinking, in which the service provider and the customer work together to develop the service.

“Social interaction isn’t achieved by an administrative organization, such as the IT Center or the Helsinki University Library, producing a lot of material. We can’t say ‘Here’s the sandbox and some nice rules for you, off you go and play’. This won’t lead to social connectedness in 2017. Users have to be involved, and the service provider has to be part of the community. Communication and feedback channels can be used to respond to situations and to share information with users. The world is changing fast, and services need to change too. Competitive advantage is based on our ability to respond to these changing needs,” says Ville Tenhunen.