How do I use the 5S method to organize data files?

A clear data management strategy and well-organized data files make your work more efficient. In this blog post we have gathered tips on how to organize files with the 5S method. Data Cleaning Week 2019 challenges University of Helsinki researchers, staff and students to check their data management routines and organize their data files.

(This article is also available in Finnish.)

Many of us have probably run out of storage space on a computer or watched the desktop turn into a mess. The 5S method helps you delete unnecessary data files and organize the remaining ones, and this is what Data Cleaning Week is all about.

The 5S method is a five-step protocol for workplace organization that was originally developed in Japan. It is a lean tool that aims to improve work efficiency and to standardize the protocols that keep an organization system in place. 5S can be used, for example, in offices, on production lines or in IT systems. It is useful for researchers who need to organize research data files, and everyone at the university, from administrators and experts to students, can adopt it. The five S's stand for:

  1. Sort
  2. Set in order
  3. Shine
  4. Standardize
  5. Sustain

The idea is that you start by sorting your data files: check all folders and remove files that are not necessary. Then set the remaining files in order: create a folder structure and naming conventions that suit your needs. The third and fourth steps are about shining and standardizing: develop a method that supports the new practices and make it part of your daily routine. Last but not least, sustain your new system and teach it to your group and colleagues so that good practices spread.

More tips are available on our 5S web page!

1. Sort

This phase reduces the time lost looking for an item by reducing the number of items. It also frees up storage space. Follow these steps:

  • Check all items in a directory (folder) and evaluate whether or not their presence is useful or necessary.
  • Remove unnecessary files and directories as soon as possible. Place those that cannot be removed immediately in a ‘red tag area’ so that they are easy to remove later on.
  • Keep the working directories clear of all files except those that are in active use.
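
The sorting steps above can be partly automated. The following is a minimal Python sketch, not a definitive tool. It assumes that a file's last-modified time is a reasonable first hint of whether the file is still in use. The folder name `_red_tag` and the function names are hypothetical choices made for illustration. The script only moves candidates into the red tag area and never deletes anything, so you can still review the files before removing them.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional

# Hypothetical name for the 'red tag area'; pick whatever suits your project.
RED_TAG_DIR_NAME = "_red_tag"


def find_stale_files(root: Path, max_age_days: int,
                     now: Optional[float] = None) -> List[Path]:
    """Return files under root that have not been modified for max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds in a day
    stale = []
    for path in root.rglob("*"):
        # Skip anything that already sits in the red tag area.
        if RED_TAG_DIR_NAME in path.relative_to(root).parts:
            continue
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)


def move_to_red_tag(root: Path, files: List[Path]) -> List[Path]:
    """Move files into the red tag area, keeping their relative paths."""
    red_tag = root / RED_TAG_DIR_NAME
    moved = []
    for path in files:
        target = red_tag / path.relative_to(root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

A possible way to use it is to first print the result of `find_stale_files(Path("my_project"), max_age_days=365)`, check the list by hand, and only then pass it to `move_to_red_tag`. The age of a file is only a hint: old raw data may still be essential, so the final decision should stay with you.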

2. Set in order

The idea of this phase is to make the workflow smooth and easy. First, create a directory structure that suits your needs. For example, if you work with sensitive data, a clear folder system also helps with access control.

Second, use clear and meaningful naming conventions for both folders and files to reduce the time spent finding particular files.

Creating a directory structure

When planning a hierarchical directory structure, take the following into consideration:

  • What sort of data will you have?
  • Are there many subprojects which need their own folders?
  • How should different raw data, cleaned data, methods, documentation, manuscripts or presentations be organized?
  • Balance between a shallow and a deep folder hierarchy to keep files findable.
    • Too deep a hierarchy takes many clicks to get to the right file.
    • Too shallow a hierarchy can end up with too many files in one folder.
  • Avoid overlapping categories and give folders meaningful names.

[Figure: Two examples of a folder structure.]
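
Once you have decided on a structure, it can be created the same way for every new project. The sketch below is one possible approach in Python. The folder names in `PROJECT_LAYOUT` are hypothetical and only reuse the categories mentioned in the list above (raw data, cleaned data, code, documentation, manuscripts, presentations). Adapt them to your own project.

```python
from pathlib import Path
from typing import List

# Hypothetical layout: each key is a folder, each value holds its subfolders.
PROJECT_LAYOUT = {
    "data": {"raw": {}, "cleaned": {}},
    "code": {},
    "docs": {},
    "ms": {},
    "presentations": {},
}


def create_structure(root: Path, layout: dict) -> List[Path]:
    """Create the nested folders described by layout under root."""
    created = []
    for name, children in layout.items():
        folder = root / name
        # exist_ok=True makes the function safe to run again later.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
        created.extend(create_structure(folder, children))
    return created
```

Calling `create_structure(Path("HB1"), PROJECT_LAYOUT)` would create the folders under a project folder named `HB1`. This layout is only two levels deep, which keeps the number of clicks low while still separating raw data from cleaned data.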

File naming tips to keep in mind:

  1. Balance the number of elements in the name: too few make the name too general, too many make it hard to understand.
  2. Use meaningful abbreviations: data, raw, ms (for manuscript), code etc.
  3. Order the elements from general to specific.
  4. Use the underscore (_) as the delimiter between elements, and a hyphen (-) or capitalization to separate words within an element. Don’t use special characters: & , * % # ; ( ) ! @ $ ^ ~ ‘ { } [ ] ? < >.
  5. Write dates in the order year, month, day (YYYYMMDD or YYYY-MM-DD).
  6. For version control, use the letter V followed by at least two digits (V06), and extend it if needed for minor changes (V06-02). Remember the leading zeros to make sure files sort correctly.
  7. Write a readme file about the naming system and explain the abbreviations (example below).
  8. Make sure your research group and collaborators use the file naming system.

Example: Honeybee project 1, experiment 3, manuscript, 14.10.2019, version 2 could be named as follows: HB1_exp3_ms_20191014_V02
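
To keep names consistent, the tips above can be turned into a small helper that builds the name for you. This is only a sketch: the function name `build_name` and its parameters are hypothetical. It assumes the YYYYMMDD date format from tip 5 and the V-plus-two-digits version tag from tip 6, and it allows only letters, digits and hyphens within an element.

```python
import datetime as dt
import re
from typing import List, Optional

# Anything other than letters, digits and hyphens is rejected inside an element.
FORBIDDEN = re.compile(r"[^A-Za-z0-9-]")


def build_name(elements: List[str], date: dt.date, version: int,
               minor: Optional[int] = None) -> str:
    """Join elements, date and version into one file name (no extension)."""
    for element in elements:
        if not element or FORBIDDEN.search(element):
            raise ValueError(f"Invalid name element: {element!r}")
    # Year-month-day order makes names sort chronologically.
    parts = list(elements) + [date.strftime("%Y%m%d")]
    # Leading zeros keep V02 before V10 when files are sorted by name.
    version_tag = f"V{version:02d}"
    if minor is not None:
        version_tag += f"-{minor:02d}"
    parts.append(version_tag)
    return "_".join(parts)


print(build_name(["HB1", "exp3", "ms"], dt.date(2019, 10, 14), 2))
# prints HB1_exp3_ms_20191014_V02
```

Writing the date as year, month, day matters because file managers sort names as text: with 20191014 the files line up chronologically, whereas day-first dates would group files by day of the month.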

3. Shine

The idea of this phase is to prevent deterioration and keep directories easy to work in. Once a standard, documented method is in place, anyone who is not familiar with the environment can easily navigate the directory, find files and detect problems.

  • Be proactive and maintain the new organization you have created.
  • Make managing data in an organized way part of your weekly routines.

4. Standardize

The idea of this phase is to establish procedures and schedules that ensure the first three ‘S’ practices are repeated.

  • Create standards and practices that suit your project needs: folder structure, file naming conventions, tags, or other standards in use.
  • Write down (document) the best practices and rules everyone should follow.
  • Discuss the practices and rules with your group, and agree that keeping data files organized is everyone’s responsibility.
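
A documented standard is easier to follow when it can be checked automatically. The sketch below shows one way a group could audit a project folder against its naming rules. It is an illustration under assumptions, not a complete validator: the pattern expects names of the form elements_YYYYMMDD_Vnn with an optional minor version, it does not accept the YYYY-MM-DD variant, and the names `NAME_PATTERN`, `is_conforming` and `audit` are hypothetical. Readme files are skipped.

```python
import datetime as dt
import re
from pathlib import Path
from typing import List

# One or more elements, then an 8-digit date, then a version tag such as V02 or V06-02.
NAME_PATTERN = re.compile(
    r"^(?:[A-Za-z0-9-]+_)+(?P<date>\d{8})_V\d{2,}(?:-\d{2,})?$"
)


def is_conforming(stem: str) -> bool:
    """Check a file name (without extension) against the naming rules."""
    match = NAME_PATTERN.match(stem)
    if match is None:
        return False
    date = match.group("date")
    try:
        # Rejects impossible dates, including day-first dates such as 14102019.
        dt.date(int(date[:4]), int(date[4:6]), int(date[6:]))
    except ValueError:
        return False
    return True


def audit(root: Path) -> List[Path]:
    """Return all files under root whose names break the naming rules."""
    return sorted(
        path for path in root.rglob("*")
        if path.is_file()
        and path.stem.lower() != "readme"
        and not is_conforming(path.stem)
    )
```

Running `audit(Path("my_project"))` as part of a weekly routine gives the group a short list of files to rename, which keeps the discussion about rules concrete.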

5. Sustain

The idea of this phase is to sustain the developed processes through self-discipline and habit.

  • Sustain the new practices and achieved organization.
  • Organize training sessions with your group regarding your data management practices, especially when new people join the group.
  • Implement improvements and changes to your protocol, if needed.

Participate in Data Cleaning Week!

Take the challenge and participate in Data Cleaning Week by posting a picture of your cleaning effort on Twitter with the hashtag #5sdata. The picture can show a folder structure, a new file naming system for your group, the space freed up by cleaning, or anything else you like. Remember to challenge your colleagues too!

You can also participate in Data Cleaning Week by sending an email to Data Support: datasupport@helsinki.fi. We will share your message through the Think Open blog, Twitter and the Helsinki.fi website.