The program has the following main characteristics:
- On each of the 5 days, one module relevant to DS and DM is presented.
- Each module is introduced by a 2-hour slot conveying knowledge and principles, followed by practical sessions where participants gain hands-on experience.
The modules can be summarised as:
- Proper Repository: how to organise and assess a proper repository with the help of a professional repository application (DSpace: http://www.dspace.org/, https://github.com/ufal/clarin-dspace)
- Registering Environmental Data: upload environmental data into the repository, create metadata, assign augmented Persistent Identifiers (PID: https://www.handle.net/, http://www.pidconsortium.eu/)
- Collection Building and Using: create collections as subjects of analysis, expose metadata, cite collections, etc.
- Data Typing: use data typing as an essential element for carrying out transformation, visualisation and analysis via a Data Type Registry (DTR: https://www.rd-alliance.org/group/data-type-registries-wg/post/data-type-registry-first-prototype.html)
- Analysing Data: address the stored data and metadata via PIDs for analysis using Beaker notebooks (http://beakernotebook.com/), Python and R (https://www.r-project.org/)
Programming and scripting will mainly be done in Python. The tools to be used (DSpace, Handles, DTR, Beaker, R, etc.) are open source and can be used in the course.
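
To give a flavour of the PID-based access mentioned under Analysing Data, the following is a minimal Python sketch, not course material. It assumes the public Handle proxy REST endpoint at https://hdl.handle.net/api/handles/ and its JSON record layout. The handle "12345/example-dataset", the sample record and the function names are hypothetical and serve only as illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the public Handle proxy REST endpoint.
HANDLE_PROXY = "https://hdl.handle.net/api/handles/"


def handle_api_url(handle):
    """Build the REST URL for a handle such as '12345/example-dataset'."""
    return HANDLE_PROXY + quote(handle, safe="/")


def extract_locations(record):
    """Return the values of all URL-typed entries in a handle record."""
    return [
        entry["data"]["value"]
        for entry in record.get("values", [])
        if entry.get("type") == "URL"
    ]


def resolve(handle):
    """Fetch a handle record over the network and return its URL entries."""
    with urlopen(handle_api_url(handle), timeout=10) as response:
        return extract_locations(json.load(response))


# Hypothetical record in the layout assumed above; no network access needed.
sample_record = {
    "responseCode": 1,
    "handle": "12345/example-dataset",
    "values": [
        {"index": 1, "type": "URL",
         "data": {"format": "string",
                  "value": "https://repository.example.org/item/42"}},
        {"index": 100, "type": "HS_ADMIN",
         "data": {"format": "admin", "value": {}}},
    ],
}

print(handle_api_url("12345/example-dataset"))
print(extract_locations(sample_record))
```

Calling resolve() with a real handle would return the location(s) the PID currently points to. An analysis script can then load the data from there instead of hard-coding a repository URL.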