Project standard operating procedures
1 Running assembly projects
If you’re new to these protocols, please see the onboarding material first.
1.1 Quick Start
1. Make a private project repository from this template repository on GitHub.
   - Click the green "Use this template" button on GitHub in the upper right corner.
   - Check that NBISweden/assembly-project-template is selected as Repository template.
   - Check that Owner is NBISweden.
   - Provide a repository name following <project>-<species>-<year>-<short_description>, where:
     - <project> is one of:
       - VREBP: For VR-EBP projects
       - ERGA: For ERGA projects
       - BGE: For BGE projects
       - SMS: For NBIS user-fee projects
       - LTS: For NBIS peer-review projects
     - <species>: Species name
     - <year>: Year project started
     - <short_description>: Short project description
   - Ensure the repository is private, then click Create repository.
2. Clone it into the NAISS Storage project or your folder on NAC.
       cd <project allocation>
       git clone git@github.com:NBISweden/<repo>.git
3. Update the README in the repository with project details.
4. Add references for important information to references.bib.
5. Copy NGI deliveries to the data folder (see launch page).
6. Link relevant raw data in data/raw-data.
7. Update assembly_parameters.yml to point to files in data/raw-data.
8. Run analyses (./run_nextflow.sh).

Refer to the other pages here for more in-depth descriptions of the protocols.
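The repository naming convention above can be checked before the repository is created. The following is a minimal sketch and not part of the template: the function name check_repo_name is hypothetical, and it assumes that the species field contains no hyphens and that the year is written with four digits.

```shell
# Hypothetical helper: check a repository name against
# <project>-<species>-<year>-<short_description>.
# Assumptions (not stated in the protocol): the species field contains no
# hyphens (e.g. Laetiporus_sulphureus) and the year has four digits.
check_repo_name() {
    # grep -E applies an extended regular expression; -q suppresses output.
    if printf '%s\n' "$1" | grep -Eq '^(VREBP|ERGA|BGE|SMS|LTS)-[^-]+-[0-9]{4}-.+$'; then
        echo "ok: $1"
    else
        echo "invalid: $1" >&2
        return 1
    fi
}
```

For example, check_repo_name "ERGA-Laetiporus_sulphureus-2024-test_assembly" succeeds, while a name with an unknown project prefix or a two-digit year returns a non-zero status.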
The template provides an organised folder structure and skeleton files so that you can start analysing quickly.
Analyses are primarily run on UPPMAX or PDC. GitHub is the primary repository; track analysis files with git and push them regularly.
1.2 Running a test assembly analysis
Follow the steps above to make a repository for a test species. If you would like to use real data then feel free to use Laetiporus sulphureus (Chicken of the Woods).
From the Data tab, download the BAM file for PacBio HiFi into the deliveries folder:
    wget ftp://ftp.sra.ebi.ac.uk/vol1/run/ERR680/ERR6808041/m64229e_210602_121910.ccs.bc1020_BAK8B_OA--bc1020_BAK8B_OA.bam
and the FastQ files for HiC (Arima v2) into the deliveries folder:
    wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR668/000/ERR6688740/ERR6688740_1.fastq.gz
    wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR668/000/ERR6688740/ERR6688740_2.fastq.gz
Symlink the files into appropriate folders under raw-data.
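The symlinking step could look like the sketch below. The helper link_raw_data is hypothetical, and the folder names in the example calls (data/deliveries, data/raw-data/pacbio-hifi, data/raw-data/hic) are assumptions for illustration; use the folders that actually exist in your copy of the template. The sketch creates absolute links, which is one possible choice and not necessarily the template's convention.

```shell
# Hypothetical helper: symlink delivered files into a raw-data subfolder.
# Usage: link_raw_data <target_dir> <file>...
link_raw_data() {
    local target_dir f abs_dir
    target_dir="$1"
    shift
    mkdir -p "$target_dir" || return 1
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
        # Resolve the source directory to an absolute path so the link
        # does not depend on the directory it is read from.
        abs_dir="$(cd "$(dirname "$f")" && pwd)" || return 1
        ln -s "$abs_dir/$(basename "$f")" "$target_dir/" || return 1
    done
}

# Example calls from the repository root (folder names are assumptions):
# link_raw_data data/raw-data/pacbio-hifi data/deliveries/*.bam
# link_raw_data data/raw-data/hic data/deliveries/ERR6688740_*.fastq.gz
```

Linking rather than copying keeps a single copy of the delivered data on disk while giving the analysis a stable path to reference in assembly_parameters.yml.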
Then edit assembly_parameters.yml to point to the data linked under raw-data. The bash snippets in that file help you write the input entries.
Update workflow_parameters.yml and set the mitohifi.code parameter to 4, the mitochondrial genetic code for this species (see NCBI Taxonomy Browser).
Finally, open a screen session and then run the launch script (./run_nextflow.sh).
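A minimal sketch of the screen step, assuming GNU screen is available where you launch the workflow. The helper start_in_screen and the default session name are hypothetical. It is a variant of the step described above: instead of opening the session interactively, it starts the launch script in a detached, named session.

```shell
# Interactive equivalent of the step above:
#   screen -S <name>        open a named session
#   ./run_nextflow.sh       run the launch script inside it
#   Ctrl-a d                detach; reattach later with: screen -r <name>

# Hypothetical helper: start the launch script in a detached, named session.
# Run it from the directory that contains run_nextflow.sh.
start_in_screen() {
    local session
    session="${1:-assembly}"   # default session name is an assumption
    if ! command -v screen >/dev/null 2>&1; then
        echo "screen not found" >&2
        return 1
    fi
    if [ ! -x ./run_nextflow.sh ]; then
        echo "run_nextflow.sh not found or not executable in $(pwd)" >&2
        return 1
    fi
    # -d -m starts the session detached; -S names it.
    screen -dmS "$session" ./run_nextflow.sh
}
```

A detached session ends when the script exits, so its terminal output is not kept. Use the interactive form if you want to watch the run, and list sessions with screen -ls.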