Project standard operating procedures
1 Protocols
Here are the standard operating procedures to follow when performing a genome assembly, annotation, and/or further analysis.
1.1 Why do we need these protocols?
- To make data findable - (strict folder structure)
- Ease project tracking - (git)
- Reduce workload - (automation, code sharing)
- Reproducibility - (workflows, notebooks, git, documentation, containers, interoperability)
- Documentation - (reporting, summaries, issue tracking)
1.2 Quick Start
Make a private Project repository from this template repository on Github.
- Click the green
Use this template
button on Github in the upper right corner. - Check
NBISweden/assembly-project-template
is selected asRepository template
. - Check
Owner
isNBISweden
. - Provide a repository name following
<project>-<species>-<year>-<short_description>
where<project>
:VREBP
: For VR-EBP projectsERGA
: For ERGA projectsBGE
: For BGE projectsSMS
: For NBIS short term projects
<species>
: Species name<year>
: Year project started<short_description>
: Short project description.
- Ensure repository is private, then click Create repository.
- Click the green
Clone it into the NAISS Storage project.
cd /proj/snic2021-6-194 git clone git@github.com:NBISweden/<repo>.git
Update README in the repository with project details.
Add references to references.bib of important information.
Copy NGI deliveries to data folder.
Link relevant raw data in
data/raw-data
.Update
assembly_parameters.yml
to point to files indata/raw-data
.Run analyses, activating any necessary compute environments.
Refer to the other pages here for more in-depth descriptions of the protocols.
The template provides an organised folder structure, and skeleton files to quickly start analyzing.
Analyses are primarily run on Uppmax. Github is used as the primary repository, and analysis files should be tracked and pushed regularly.