29-Oct-2024

Stage | Raw data | Metadata |
---|---|---|
Data acquisition | Data arrives in cumbersome and proprietary format | In researcher’s lab journal |
Analysis | Gets converted to format of choice. Original files (and conversion settings) are lost | Hard-coded in various analysis scripts |
First submission | Mailed back and forth between collaborators in ever-changing (but nicely coloured) Excel sheets | |
Review | Leads a quiet life on the HPC cluster, until the project expires and the data has to be urgently retrieved | |
Second submission | Ends its days on an external hard drive on the researcher’s desk | Reformatted and included as PDF in the supplementary |
Publication | “Data available upon request” | |

Strive to make your data FAIR (Findable, Accessible, Interoperable, Reusable) for both machines and humans:
Why Open Access?
Which sample file represents the most up-to-date version?
The first step towards working reproducibly: Get organised!
A simple but effective example is the following:
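
One way to set up such a project layout is with a short script. This is only a minimal sketch: the directory names (`code`, `data/raw`, `data/processed`, `doc`, `results`) and the function name `create_project_skeleton` are illustrative assumptions, not a prescribed standard, and should be adapted to the project at hand.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: these directory names are illustrative assumptions.
PROJECT_DIRS = ["code", "data/raw", "data/processed", "doc", "results"]


def create_project_skeleton(root, subdirs=PROJECT_DIRS):
    """Create the project directories under `root` and return their paths."""
    root = Path(root)
    created = []
    for sub in subdirs:
        path = root / sub
        # parents=True also creates `root` and intermediate directories;
        # exist_ok=True makes the function safe to re-run.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    # An empty README marks where project-level documentation belongs.
    (root / "README.md").touch(exist_ok=True)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for p in create_project_skeleton(Path(tmp) / "my_project"):
            print(p.relative_to(tmp))
```

Keeping raw data in its own directory, separate from processed data and results, makes it harder to overwrite the original files by accident.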
A Snakemake-based example: snakemake-workflows/template
A Nextflow-based example: fasterius/nbis-support-template
Working on an HPC cluster over SSH from the command line: