Settings

Input files with qc-stats were: data/sum/qc_summary.csv

Sampleinfo file: /proj/b2017179/private/scRNAseq/meta/Meta_data_norun1.csv

Gene expression file: data/sum/merge.rpkmforgenes_rpkm.txt

The file /proj/b2017179/private/analysis/s_petropoulos_1701/source/qc_settings.yaml specifies settings:

Apart from this report, the output consists of 3 files:

Filtering based on QC-stats

First, all the QC-stats that will be used for filtering are plotted according to each criterion. The filtering criteria are defined in /proj/b2017179/private/analysis/s_petropoulos_1701/source/qc_settings.yaml under section filter_settings: filters.

Histograms of all QC data, with failed cells in red and non-failed cells in blue for each criterion. The cutoffs are given in the heading of each section.
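As an illustration of how per-criterion pass/fail flags of this kind can be derived, here is a minimal sketch in base R. It is not the code of the report script. The column names, the cutoff values, and the assumption that every cutoff is a lower bound are all hypothetical; the real criteria come from qc_settings.yaml.

```r
# Hypothetical QC table: one row per cell, one column per QC statistic.
qc <- data.frame(
  uniquely_mapped = c(0.85, 0.40, 0.90, 0.75),
  detected_genes  = c(6000, 1500, 800, 7000),
  row.names = c("cell1", "cell2", "cell3", "cell4")
)

# Hypothetical lower cutoffs; the real ones are read from qc_settings.yaml.
cutoffs <- c(uniquely_mapped = 0.5, detected_genes = 1000)

# One logical column per criterion: TRUE when the cell is below the cutoff.
failed <- sapply(names(cutoffs), function(stat) qc[[stat]] < cutoffs[[stat]])
rownames(failed) <- rownames(qc)

# A cell is filtered if it fails at least one criterion.
filtered <- rowSums(failed) > 0
```

In this toy table, cell2 fails on uniquely_mapped and cell3 on detected_genes, so both are flagged in filtered.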

Filtered cells

Scatter plots with all QC-data, red for filtered cells and black for non-filtered cells

Overlap of filtered cells per filtering criterion. Each box shows the number of cells in the intersection of 2 different filtering criteria; the diagonal shows the total number of cells below the cutoff for a single criterion. The final column “unique” shows the number of cells that are below the cutoff only for that particular stat.
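An overlap table with this layout can be computed from a logical cells-by-criteria matrix. The sketch below is one possible way to do it in base R, not necessarily how the report script does it; the criterion names and values are made up.

```r
# Hypothetical pass/fail matrix: rows are cells, columns are filtering
# criteria, TRUE means the cell is below the cutoff for that criterion.
failed <- cbind(
  mapped = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  genes  = c(TRUE, FALSE, TRUE, FALSE, FALSE),
  spikes = c(FALSE, FALSE, TRUE, FALSE, FALSE)
)

# crossprod(x) computes t(x) %*% x: entry [i, j] counts the cells failing
# both criterion i and criterion j, so the diagonal holds the totals.
overlap <- crossprod(failed * 1)

# Cells that fail exactly one criterion, counted under that criterion.
only_one <- rowSums(failed) == 1
unique_counts <- colSums(failed & only_one)

overlap_table <- cbind(overlap, unique = unique_counts)
```

Here the diagonal of overlap_table gives 3, 2 and 1 failed cells for mapped, genes and spikes, and only mapped has cells (2 of them) that fail no other criterion.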

Summary of filtering per batch

The number of cells per batch (as defined by section “batches” in /proj/b2017179/private/analysis/s_petropoulos_1701/source/qc_settings.yaml) and the proportion of filtered cells per batch.
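A per-batch summary of this kind reduces to a grouped count and a grouped mean. The following base R sketch assumes a hypothetical table with one row per cell; the batch labels and the filtered flags are invented for the example.

```r
# Hypothetical sample info: batch label and filtering outcome per cell.
cells <- data.frame(
  Embryo   = c("E1", "E1", "E1", "E2", "E2"),
  filtered = c(TRUE, FALSE, FALSE, TRUE, TRUE)
)

# Number of cells and number of filtered cells in each batch.
n_cells    <- tapply(cells$filtered, cells$Embryo, length)
n_filtered <- tapply(cells$filtered, cells$Embryo, sum)

per_batch <- data.frame(
  batch         = names(n_cells),
  cells         = as.vector(n_cells),
  filtered      = as.vector(n_filtered),
  prop_filtered = as.vector(n_filtered / n_cells)
)
```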

Summary of all qc-stats per batch

The distribution of each of the QC-stats, split by batch, is plotted as violin plots, together with a histogram for all samples. If specified, the section “plot” defines which QC measures to plot stats for; if it is not specified, all columns in the input file are plotted.

Scatterplot of the qc-stats colored by batch: Embryo. Colors per batch are defined the same way as in the first of the preceding violin plots.

Note: the script will only plot this scatter for the first batch defined in section “batches” of /proj/b2017179/private/analysis/s_petropoulos_1701/source/qc_settings.yaml.

PCA contribution of qc-stats

Summary of expression data using the scater package. 35 cells were removed from the expression matrix with 834 samples. 47882 genes out of 58313 were kept, namely those with a summed expression higher than 15.98.
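The two filtering steps described here, dropping the QC-filtered cells and then keeping genes above a summed-expression cutoff, can be sketched in base R as follows. The matrix, the cell name and the cutoff of 2.5 are made-up values for illustration; how the report script arrives at its cutoff of 15.98 is not shown in this report.

```r
# Hypothetical expression matrix: genes in rows, cells in columns.
expr <- matrix(
  c(0, 0, 1,
    5, 8, 9,
    2, 0, 0,
    7, 7, 7),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

# Drop the cells flagged by the QC filters (cell2 is a made-up example).
filtered_cells <- "cell2"
expr <- expr[, !(colnames(expr) %in% filtered_cells), drop = FALSE]

# Keep genes whose summed expression across the remaining cells exceeds
# the cutoff; 2.5 is an arbitrary value chosen for this toy matrix.
min_sum <- 2.5
expr <- expr[rowSums(expr) > min_sum, , drop = FALSE]
```

Using drop = FALSE keeps the result a matrix even if only one gene or one cell survives the filter.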

Top expressed genes

Highest expressed genes colored by batch: Embryo.

Explained variance

Explained variance for different QC-measures/batch info.
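One common way to quantify how much variance a QC measure explains is the R-squared of a per-gene linear model with that measure as the only predictor. The base R sketch below shows the idea on invented data; it is an assumption about the general approach, not a reproduction of the scater call used for the plot.

```r
# Hypothetical log-expression matrix (genes x cells) and one QC variable.
expr <- rbind(
  geneA = c(1, 2, 3, 4, 5, 7),
  geneB = c(2, 1, 2, 1, 2, 1)
)
detected_genes <- c(10, 20, 30, 40, 50, 60)

# R-squared of a per-gene linear model: the share of that gene's variance
# explained by the QC variable.
r_squared <- apply(expr, 1, function(y) {
  summary(lm(y ~ detected_genes))$r.squared
})
```

In this toy example geneA tracks the QC variable closely and gets a high R-squared, while geneB alternates independently of it and gets a low one.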

PCA

PCA with PC1 and PC2 coloured by different QC-measures/batch info.

PC correlation of different QC-measures

Top correlated PCs for each of the different QC-measures/batch info
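Ranking PCs by their correlation with a QC measure can be sketched with prcomp and cor from base R. The data below are invented, and the use of absolute Pearson correlation is an assumption; the report script may rank PCs differently.

```r
# Hypothetical data: 6 cells in rows, 3 genes in columns, plus one QC
# statistic per cell.
expr <- cbind(
  geneA = c(1, 2, 3, 4, 5, 7),
  geneB = c(2, 1, 2, 1, 2, 1),
  geneC = c(5, 3, 4, 1, 2, 0)
)
qc_stat <- c(10, 20, 30, 40, 50, 60)

# prcomp expects observations (cells) in rows.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Absolute Pearson correlation of the QC statistic with each PC's scores.
pc_cor <- abs(cor(pca$x, qc_stat))[, 1]

# PCs ranked from most to least correlated with the QC statistic.
top_pcs <- names(sort(pc_cor, decreasing = TRUE))
```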

Expression of marker genes in PCA

Plot marker genes if any are specified in /proj/b2017179/private/analysis/s_petropoulos_1701/source/qc_settings.yaml under section markers.

Reproducibility

This report was created with the script render_qc_summary.R, which in turn calls the rmarkdown script make_qc_summary_report.R. The script requires a yaml file in which filtering cutoffs and plotting options are set, passed with the flag -c; see the file with default settings, source/qc_settings.yaml. It also requires a list of one or more infiles that contain all the stats as columns, and a sampleinfo file with metadata about batches etc.

Dependencies:

This report was executed as:

/proj/b2017179/nobackup/private/programs/conda/envs/spet_1701/lib/R/bin/exec/R --slave --no-restore --file=/proj/b2017179/private/analysis/qc-summary_scrnaseq/source/render_qc_summary.R --args -i data/sum/qc_summary.csv -e data/sum/merge.rpkmforgenes_rpkm.txt -o data/sum/qc_summary/qc_summary -s /proj/b2017179/private/scRNAseq/meta/Meta_data_norun1.csv -c /proj/b2017179/private/analysis/s_petropoulos_1701/source/qc_settings.yaml

Which was called with command:

Rscript /proj/b2017179/private/analysis/qc-summary_scrnaseq/source/render_qc_summary.R -i data/sum/qc_summary.csv -e data/sum/merge.rpkmforgenes_rpkm.txt -o data/sum/qc_summary/qc_summary -s /proj/b2017179/private/scRNAseq/meta/Meta_data_norun1.csv -c /proj/b2017179/private/analysis/s_petropoulos_1701/source/qc_settings.yaml

Session info

sessionInfo()
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Scientific Linux release 6.9 (Carbon)
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  methods   stats     graphics  grDevices utils     datasets 
## [8] base     
## 
## other attached packages:
##  [1] scater_1.2.0        ggplot2_2.2.1       Biobase_2.34.0     
##  [4] BiocGenerics_0.20.0 vioplot_0.2         sm_2.2-5.4         
##  [7] gplots_3.0.1        yaml_2.1.13         optparse_1.3.2     
## [10] rmarkdown_1.3      
## 
## loaded via a namespace (and not attached):
##  [1] tximport_1.2.0       beeswarm_0.2.3       gtools_3.5.0        
##  [4] locfit_1.5-9.1       reshape2_1.4.2       rhdf5_2.18.0        
##  [7] lattice_0.20-34      colorspace_1.3-2     htmltools_0.3.5     
## [10] getopt_1.20.0        stats4_3.3.2         viridisLite_0.2.0   
## [13] XML_3.98-1.5         DBI_0.5-1            matrixStats_0.52.2  
## [16] plyr_1.8.4           zlibbioc_1.20.0      stringr_1.2.0       
## [19] munsell_0.4.3        gtable_0.2.0         caTools_1.17.1      
## [22] evaluate_0.10        memoise_1.1.0        labeling_0.3        
## [25] knitr_1.15.1         IRanges_2.8.2        biomaRt_2.28.0      
## [28] vipor_0.4.5          httpuv_1.3.3         AnnotationDbi_1.38.0
## [31] Rcpp_0.12.10         KernSmooth_2.23-15   xtable_1.8-2        
## [34] edgeR_3.16.5         scales_0.4.1         backports_1.0.4     
## [37] gdata_2.17.0         limma_3.30.13        S4Vectors_0.12.2    
## [40] mime_0.5             gridExtra_2.2.1      rjson_0.2.15        
## [43] digest_0.6.12        stringi_1.1.5        dplyr_0.5.0         
## [46] shiny_1.0.3          grid_3.3.2           rprojroot_1.1       
## [49] tools_3.3.2          bitops_1.0-6         magrittr_1.5        
## [52] lazyeval_0.2.0       RCurl_1.95-4.8       tibble_1.3.0        
## [55] RSQLite_1.1-1        data.table_1.10.4    ggbeeswarm_0.5.3    
## [58] shinydashboard_0.6.1 assertthat_0.1       viridis_0.4.0       
## [61] R6_2.2.1