Meet Mildred 4: Storage for big data researchers and safety for all

The technical infrastructure of the Mildred services is built at the IT Center. Whereas Mildred’s Sub-project 2 was primarily concerned with data storage, sharing and management, Mildred’s Sub-project 4 mainly concerns data storage (capacity) and backup.

Ville Tenhunen and Minna Harjuniemi in Viikki, Helsinki. Photo by Jussi Männistö.

The goal of Mildred 4 is to ensure that researchers can flexibly increase their data storage capacity and preserve their data in the event of potential technical problems. These data storage and backup services are important to all who use data storage services, but especially to researchers and research groups with high data volumes.

“We are building storage that suits big data researchers. Researchers with smaller data can use the services built by Mildred 2. Mildred 4 storage is useful for researchers who have hundreds of terabytes of data. The target group is clearly in data-intensive research,” explains Mildred 4’s Project Manager, Ville Tenhunen.

“The data storage built in Mildred 4 is perhaps related more to data management during the research process. When the dataset is ready to be published, Mildred 2 and Mildred 3 services come into play,” says Mildred 4’s Project Owner, Minna Harjuniemi.

Piloting is already complete: Ceph and GlusterFS have been tested for Mildred 4 data storage. The purchases should take place in the autumn of 2017.

The data backup service is essentially ready to be purchased. Several Mildred services are being piloted by researchers this autumn, but the backup service does not need to be tested in the same way. Technical piloting is enough for an “invisible background service”.

“No piloting is needed for the backup service because it’s a well-established standard service,” says Tenhunen.

According to Minna Harjuniemi and Ville Tenhunen, the Mildred 2 and 4 processes have revealed no real surprises with regard to content.

“We still feel we’re on the right track and that we’re doing the right things for both society and the planet. It’s a good idea to get involved in data sharing, and it’s a good idea to promote it and provide researchers with tools for it,” says Minna Harjuniemi.