Determining the best data storage solution for your research can be cumbersome. To make the selection process easier, University of Helsinki researchers can now use the Data Storage and Sharing Table, which describes the various options available for storing research data. This blog post explains the criteria for choosing your research data storage solution.
Text: Sebastian Porceddu
Getting an overview of the wide variety of data storage solutions available at the university can be confusing. We are constantly thinking about new ways to make information about these solutions available to our researchers and other university staff in a helpful way. Our latest addition to this toolkit is the Data Storage and Sharing Table, which covers the features of the main storage solutions.
In the table, storage solutions are organised into three overlapping categories based on the user access control features of the service:
- Data storage solutions suitable for personal storage (i.e. only one user may access the data).
- Data storage solutions suitable for groups of UH internal users.
- Data storage solutions suitable for easy data sharing with external collaborators and the wider public.
Other main features and selection criteria for the storage services are displayed in the columns:
- Whether backup and/or version control is included in the service (yes/no).
- Maximum capacity of the storage solution.
- Whether the storage solution can accommodate sensitive data.
- Cost of the storage solution.
Version control, backup and sensitive data
The difference between backups and version control may not always be clear. Version control means that the storage system includes a ‘recycle bin’ for previous file versions. These file versions are available in case the file is deleted, or you need to check an earlier version. However, the previous versions are usually located in the same storage space as the current files and are therefore not protected if the entire system fails. A full backup means that a technically independent duplicate of the same data exists in another location.
Since most of the network and cloud storage solutions offer a level of version control, the challenge in choosing a suitable storage solution is often to find the right balance between capacity, the ability to accommodate sensitive data, and cost. Cloud services such as Teams and OneDrive offer the most cost-effective (or even free) way to store and share large files, but these solutions are not suitable for sensitive data.
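For readers who like to think in code, the selection criteria above can be sketched as a simple filter. This is only an illustration: the `StorageOption` fields, the `shortlist` function and all three example entries are hypothetical placeholders, not values taken from the Data Storage and Sharing Table or from any real service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    """One row of a storage comparison table (hypothetical fields)."""
    name: str
    access: frozenset          # subset of {"personal", "group", "external"}
    has_backup: bool           # True if a full, independent backup exists
    max_capacity_gb: float
    allows_sensitive: bool
    cost_per_tb_year: float    # 0.0 means free of charge


def shortlist(options, *, access, size_gb, sensitive, need_backup=False):
    """Return the options that meet every requirement, cheapest first."""
    matches = [
        o for o in options
        if access in o.access
        and o.max_capacity_gb >= size_gb
        and (o.allows_sensitive or not sensitive)
        and (o.has_backup or not need_backup)
    ]
    return sorted(matches, key=lambda o: o.cost_per_tb_year)


# Placeholder entries for illustration only; all figures are invented.
OPTIONS = [
    StorageOption("example-cloud",
                  frozenset({"personal", "group", "external"}),
                  False, 1000, False, 0.0),
    StorageOption("example-network-folder",
                  frozenset({"personal", "group"}),
                  True, 100, True, 50.0),
    StorageOption("example-tailored",
                  frozenset({"group"}),
                  True, 100_000, True, 200.0),
]

if __name__ == "__main__":
    for option in shortlist(OPTIONS, access="group", size_gb=50, sensitive=True):
        print(option.name)
```

With these invented entries, a small sensitive data set shared within a group matches the network folder and the tailored option, while a large sensitive data set leaves only the tailored option. In practice the real table, and a conversation with IT support, should drive the decision.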
Tailored data storage for large data sets
Small sets of sensitive data are relatively easy to manage in the regular university network folders, but large sets present a challenge. If you need to store and manage large sets of sensitive data, the optimal solution probably requires tailoring on a case-by-case basis by comparing the price and manageability of the systems that the university IT Center or CSC – IT Center for Science can offer on demand. With tailored data storage solutions, it always helps to have a clear concept of the workflow in advance and to know whether it is possible, for example, to anonymise the bulk of the data and apply stricter restrictions only to the smaller sensitive data sets.
Find the optimal storage solution for your data with the Data Storage and Sharing Table. If you need any assistance with selecting the data storage solution for your research needs, please do not hesitate to contact us at: firstname.lastname@example.org.
Sebastian Porceddu works as a Solution Consultant at the University of Helsinki IT Center.