CRISPRball 1.0.0
As CRISPR screens become increasingly common, so does the need to easily interpret, visualize, compare, and explore their results.
CRISPRball is a Shiny application to explore, visualize, filter, and integrate CRISPR screens with public data and multiple datasets. In particular, it allows for publication-quality figure generation including full aesthetic customization and interactive labeling, filtering of results using DepMap Common Essential genes, simple comparisons between datasets/timepoints/treatments, etc.
It is designed for end users and may be particularly useful for bioinformatics/genome editing cores that perform basic analyses before returning results to users. Pointing users to the online version of the app (or a hosted one) will allow them to quickly wade through and interpret their data.
Currently, it supports the output from the MAGeCK RRA and MLE analysis methods. This package supplements the MAGeCKFlute Bioconductor package, adding functionality, visualizations, and a Shiny interface to explore the results generated with that package.
Support for the output of additional analysis tools and methods will be added upon request.
CRISPRball is available on Bioconductor and can be installed as follows:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("CRISPRball")
Starting the app is as simple as calling the CRISPRball function.
library("CRISPRball")
CRISPRball()
Users can then upload their data within the app, which will enable specific tabs in the application as the data is provided.
Input data can also be passed directly to the function. All that is needed are the file paths to the MAGeCK output files, which are read into data frames and supplied as arguments.
Passing data directly is useful when hosting the app on a local Shiny server where the data should be pre-loaded for the user, particularly for core or shared resource facilities that perform basic analyses for end users.
In this case, we’ll use the example output from the third MAGeCK tutorial.
In this example, the two datasets are just reverse comparisons (ESC vs plasmid & plasmid vs ESC) where DDX27 has been manually altered in the ESC vs plasmid comparison to no longer be a significant hit.
# Read the gene and sgRNA summaries for each dataset, plus the count
# summary and normalized counts. check.names = FALSE keeps column names
# such as "neg|fdr" intact.
d1.genes <- read.delim(system.file("extdata", "esc1.gene_summary.txt",
    package = "CRISPRball"
), check.names = FALSE)
d2.genes <- read.delim(system.file("extdata", "plasmid.gene_summary.txt",
    package = "CRISPRball"
), check.names = FALSE)
d1.sgrnas <- read.delim(system.file("extdata", "esc1.sgrna_summary.txt",
    package = "CRISPRball"
), check.names = FALSE)
d2.sgrnas <- read.delim(system.file("extdata", "plasmid.sgrna_summary.txt",
    package = "CRISPRball"
), check.names = FALSE)
count.summ <- read.delim(system.file("extdata", "escneg.countsummary.txt",
    package = "CRISPRball"
), check.names = FALSE)
norm.counts <- read.delim(system.file("extdata", "escneg.count_normalized.txt",
    package = "CRISPRball"
), check.names = FALSE)
# Look at the first few rows of the gene summary for the ESC vs plasmid comparison.
head(d1.genes)
##      id num  neg|score neg|p-value  neg|fdr neg|rank neg|goodsgrna neg|lfc
## 1 PMPCB   5 2.7210e-08  4.9505e-06 0.000990        1             4 -4.4769
## 2 DDX27   5 1.9319e-07  4.9505e-06 1.000000        2             5 -4.7853
## 3 PSMD6   5 2.7462e-07  4.9505e-06 0.000990        3             5 -4.3464
## 4  ORC6   4 5.5840e-07  4.9505e-06 0.000990        4             4 -4.7794
## 5  HARS   5 7.2685e-07  4.9505e-06 0.000990        5             5 -4.7221
## 6 PSMB4   5 2.1014e-06  1.4851e-05 0.002122        6             5 -3.4927
##   pos|score pos|p-value  pos|fdr pos|rank pos|goodsgrna pos|lfc
## 1   0.47199     0.63035 0.999995      582             1 -4.4769
## 2   1.00000     1.00000 1.000000     1000             0 -4.7853
## 3   0.99787     0.99790 0.999995      986             0 -4.3464
## 4   1.00000     1.00000 0.999995      999             0 -4.7794
## 5   1.00000     1.00000 0.999995      998             0 -4.7221
## 6   0.99999     1.00000 0.999995      997             0 -3.4927
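Because the files are read with check.names = FALSE, column names such as neg|fdr keep the | character and have to be accessed as strings (or quoted with backticks). The following is a minimal sketch of filtering such a table by FDR. It uses a small hand-built data frame that mirrors three columns of the output shown above; the name toy.genes and the 0.05 cutoff are assumptions for illustration, not package defaults.

```r
# Rebuild a few columns of the gene summary shown above by hand.
# check.names = FALSE keeps names such as "neg|fdr" intact.
toy.genes <- data.frame(
    id = c("PMPCB", "DDX27", "PSMD6", "ORC6", "HARS", "PSMB4"),
    `neg|fdr` = c(0.000990, 1.000000, 0.000990, 0.000990, 0.000990, 0.002122),
    `neg|lfc` = c(-4.4769, -4.7853, -4.3464, -4.7794, -4.7221, -3.4927),
    check.names = FALSE
)

# Illustrative cutoff, not a CRISPRball default.
fdr.cutoff <- 0.05

# Columns with "|" in their names are accessed as strings.
neg.hits <- toy.genes[toy.genes[["neg|fdr"]] < fdr.cutoff, ]
neg.hits$id
```

DDX27 drops out of neg.hits here because its FDR was manually set to 1 in this comparison, even though its score and log fold change resemble those of the other top-ranked genes.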
We can then provide this data to the CRISPRball function.
genes <- list(ESC = d1.genes, plasmid = d2.genes)
sgrnas <- list(ESC = d1.sgrnas, plasmid = d2.sgrnas)
CRISPRball(
    gene.data = genes, sgrna.data = sgrnas,
    count.summary = count.summ, norm.counts = norm.counts
)
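When several datasets are pre-loaded, for example on a hosted instance, the repeated read.delim calls can be wrapped in a small helper that turns a named vector of file paths into the named list passed as gene.data or sgrna.data. The sketch below is one possible approach; read_named_summaries is a hypothetical helper and is not part of CRISPRball.

```r
# Hypothetical helper (not part of CRISPRball): read MAGeCK summary files
# into a named list of data frames, one element per dataset.
read_named_summaries <- function(paths) {
    # Each path needs a name, which becomes the dataset name in the list.
    stopifnot(!is.null(names(paths)), all(nzchar(names(paths))))
    absent <- paths[!file.exists(paths)]
    if (length(absent) > 0) {
        stop("File(s) not found: ", paste(absent, collapse = ", "))
    }
    # check.names = FALSE preserves column names such as "neg|fdr".
    lapply(paths, read.delim, check.names = FALSE)
}

# Usage with the example files shipped with the package:
# genes <- read_named_summaries(c(
#     ESC = system.file("extdata", "esc1.gene_summary.txt",
#         package = "CRISPRball"
#     ),
#     plasmid = system.file("extdata", "plasmid.gene_summary.txt",
#         package = "CRISPRball"
#     )
# ))
```

Failing early on a missing file or an unnamed path keeps a misconfigured deployment from starting with partially loaded data.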
CRISPRball also supports the MLE output from MAGeCK. In this case, we’ll use the example data from the fourth MAGeCK tutorial.
# Read the MLE gene summary, the count summary, and the normalized counts.
genes <- read_mle_gene_summary(system.file("extdata", "beta_leukemia.gene_summary.txt",
    package = "CRISPRball"
))
count.summ <- read.delim(system.file("extdata", "escneg.countsummary.txt",
    package = "CRISPRball"
), check.names = FALSE)
norm.counts <- read.delim(system.file("extdata", "escneg.count_normalized.txt",
    package = "CRISPRball"
), check.names = FALSE)
CRISPRball(
    gene.data = genes,
    count.summary = count.summ, norm.counts = norm.counts
)
On load, the application displays the QC tab, which provides multiple interactive plots to assess the quality of all samples in the dataset. The plots cover the Gini index (a measure of read distribution inequality), counts of fully depleted sgRNAs, percentage of reads mapped, read distributions across sgRNAs, a correlation matrix between samples, and a PCA plot containing all samples.
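To make two of these metrics concrete, here is a rough sketch of how a Gini index and a count of fully depleted sgRNAs could be computed from a count table in base R. The function gini_index, the toy count values, and the use of log2(count + 1) are all assumptions for illustration; the values displayed in the app come from the MAGeCK count summary and may be computed differently.

```r
# Gini index of a non-negative vector: 0 means perfectly even values,
# values near 1 mean a few sgRNAs hold most of the reads.
gini_index <- function(x) {
    x <- sort(x)
    n <- length(x)
    if (n == 0 || sum(x) == 0) {
        return(NA_real_)
    }
    2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

# Toy counts for two hypothetical samples (not real data).
toy.counts <- data.frame(
    even = c(100, 110, 95, 105, 90, 100),
    skewed = c(0, 0, 5, 10, 0, 585)
)

# Gini index on log-scaled counts (an assumption, see above) and the
# number of zero-count sgRNAs per sample.
gini <- vapply(toy.counts, function(x) gini_index(log2(x + 1)), numeric(1))
zero.counts <- colSums(toy.counts == 0)
```

In this toy example the skewed sample has both the higher Gini index and three zero-count sgRNAs, which is the kind of pattern the QC plots are meant to surface.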
Controls to adjust the plots are provided in the sidebar on the left side of the application. Plots are resizable and can be downloaded as SVGs using the plotly controls. In addition, interactive HTML versions of the plots can be downloaded with the Download buttons above or below each plot.