Limitations

    • Lack of documentation slowed ramp-up, making it harder than necessary to get started.
    • A guiding document on “Puhti best practices” or similar would help. It could prevent certain problems with access and disk usage.
    • Lack of labeled data made it difficult to quantify how good our clusters actually are. This should be improved in future work.
    • The dataset would be easier to work with if it were moved to a database management system, with examples of how to query it, instead of being spread across CSV files and directories on the cluster.
    • More annotated images labeled with printers are needed.
    • Metadata analysis was limited by missing information about actors. Clusters such as cluster 34 cannot be reliably assessed from their metadata alone.
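
As a rough illustration of the database suggestion above, the sketch below loads every CSV file under a directory into a single SQLite table and runs one example query. SQLite is an assumption here, chosen only because it ships with Python. The table name `images`, the columns `image_id` and `cluster`, and the sample rows are hypothetical and do not reflect the actual dataset layout.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_csv_dir(csv_dir, conn, table="images"):
    """Load every *.csv under csv_dir (recursively) into one SQLite table.

    Assumes all files share the same header row and that table and column
    names are trusted. Returns the number of rows loaded.
    """
    total = 0
    columns = None
    for path in sorted(Path(csv_dir).rglob("*.csv")):
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            if header is None:
                continue  # skip empty files
            if columns is None:
                # The first file defines the schema; every column is stored as TEXT.
                columns = header
                col_sql = ", ".join(f'"{c}" TEXT' for c in columns)
                conn.execute(
                    f'CREATE TABLE IF NOT EXISTS "{table}" '
                    f'({col_sql}, "source_file" TEXT)'
                )
            elif header != columns:
                raise ValueError(f"{path} has a different header: {header}")
            placeholders = ", ".join("?" for _ in range(len(columns) + 1))
            # Rows with the wrong number of fields are skipped;
            # the originating file is recorded in source_file.
            rows = [row + [str(path)] for row in reader if len(row) == len(columns)]
            conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
            total += len(rows)
    conn.commit()
    return total


def demo():
    """Build two tiny CSV files (made-up data) and count images per cluster."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "batch_a").mkdir()
        (root / "batch_a" / "part1.csv").write_text(
            "image_id,cluster\nimg1,34\nimg2,34\n", encoding="utf-8"
        )
        (root / "part2.csv").write_text(
            "image_id,cluster\nimg3,7\n", encoding="utf-8"
        )
        conn = sqlite3.connect(":memory:")
        load_csv_dir(root, conn)
        query = (
            "SELECT cluster, COUNT(*) FROM images "
            "GROUP BY cluster ORDER BY cluster"
        )
        result = conn.execute(query).fetchall()
        conn.close()
        return result
```

Calling `demo()` returns one `(cluster, count)` pair per cluster. A single queryable table like this would replace ad hoc traversal of directories, though a shared deployment on the cluster might call for a different database system.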