1 Introduction

CRISPR screens are increasingly common, and with them grows the need to easily interpret, visualize, compare, and explore the results of these assays.

CRISPRball is a Shiny application to explore, visualize, filter, and integrate CRISPR screens with public data and multiple datasets. In particular, it supports generating publication-quality figures with full aesthetic customization and interactive labeling, filtering results using DepMap Common Essential genes, and making simple comparisons between datasets, timepoints, or treatments.

It is designed for end users and may be particularly useful for bioinformatics/genome editing cores that perform basic analyses before returning results to users. Pointing users to the online version of the app (or a hosted one) will allow them to quickly wade through and interpret their data.

Currently, it supports the output from the MAGeCK RRA and MLE analysis methods. This package supplements the MAGeCKFlute Bioconductor package, adding functionality, visualizations, and a Shiny interface to explore the results generated with that package.

Support for the output of additional analysis tools and methods will be added upon request.

1.1 Installation

CRISPRball is available on Bioconductor and can be installed as follows:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("CRISPRball")

1.2 Usage

Starting the app is as simple as calling the CRISPRball function.

library("CRISPRball")
CRISPRball()

Users can then upload their data within the app, which will enable specific tabs in the application as the data is provided.

Figure 1: Screenshot of the CRISPRball application when launched as a server where users can directly upload MAGeCK RRA or MLE output.
Information on the format of the expected data is provided in the following sections.

Input data can also be passed directly to the CRISPRball function: read the MAGeCK output files into R and supply the resulting data frames as arguments.

Passing data directly is useful when hosting the app on a local Shiny server with data pre-loaded for users. This is particularly helpful for core or shared resource facilities that perform basic analyses for end users.

1.3 MAGeCK RRA Output

In this case, we’ll use the example output from the third MAGeCK tutorial.

In this example, the two datasets are simply reverse comparisons (ESC vs plasmid and plasmid vs ESC), and DDX27 has been manually altered in the ESC vs plasmid comparison so that it is no longer a significant hit.

# Read the gene-level, sgRNA-level, and count summaries for each dataset.
d1.genes <- read.delim(system.file("extdata", "esc1.gene_summary.txt",
    package = "CRISPRball"
), check.names = FALSE)
d2.genes <- read.delim(system.file("extdata", "plasmid.gene_summary.txt",
    package = "CRISPRball"
), check.names = FALSE)

d1.sgrnas <- read.delim(system.file("extdata", "esc1.sgrna_summary.txt",
    package = "CRISPRball"
), check.names = FALSE)
d2.sgrnas <- read.delim(system.file("extdata", "plasmid.sgrna_summary.txt",
    package = "CRISPRball"
), check.names = FALSE)

count.summ <- read.delim(system.file("extdata", "escneg.countsummary.txt",
    package = "CRISPRball"
), check.names = FALSE)
norm.counts <- read.delim(system.file("extdata", "escneg.count_normalized.txt",
    package = "CRISPRball"
), check.names = FALSE)

# Look at the first few rows of the gene summary for the ESC vs plasmid comparison.
head(d1.genes)
##      id num  neg|score neg|p-value  neg|fdr neg|rank neg|goodsgrna neg|lfc
## 1 PMPCB   5 2.7210e-08  4.9505e-06 0.000990        1             4 -4.4769
## 2 DDX27   5 1.9319e-07  4.9505e-06 1.000000        2             5 -4.7853
## 3 PSMD6   5 2.7462e-07  4.9505e-06 0.000990        3             5 -4.3464
## 4  ORC6   4 5.5840e-07  4.9505e-06 0.000990        4             4 -4.7794
## 5  HARS   5 7.2685e-07  4.9505e-06 0.000990        5             5 -4.7221
## 6 PSMB4   5 2.1014e-06  1.4851e-05 0.002122        6             5 -3.4927
##   pos|score pos|p-value  pos|fdr pos|rank pos|goodsgrna pos|lfc
## 1   0.47199     0.63035 0.999995      582             1 -4.4769
## 2   1.00000     1.00000 1.000000     1000             0 -4.7853
## 3   0.99787     0.99790 0.999995      986             0 -4.3464
## 4   1.00000     1.00000 0.999995      999             0 -4.7794
## 5   1.00000     1.00000 0.999995      998             0 -4.7221
## 6   0.99999     1.00000 0.999995      997             0 -3.4927
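
As a quick check on the manual DDX27 edit, one can filter the gene summary on its `neg|fdr` column. The sketch below rebuilds only the `id` and `neg|fdr` columns from the six rows printed above so that it runs on its own. The 0.05 cutoff and the names `top.genes`, `fdr.cutoff`, and `hits` are illustrative assumptions, not CRISPRball defaults.

```r
# Subset of the printed gene summary: gene IDs and negative-selection FDR.
# check.names = FALSE keeps the MAGeCK-style column name "neg|fdr" intact.
top.genes <- data.frame(
    id = c("PMPCB", "DDX27", "PSMD6", "ORC6", "HARS", "PSMB4"),
    `neg|fdr` = c(0.000990, 1.000000, 0.000990, 0.000990, 0.000990, 0.002122),
    check.names = FALSE
)

fdr.cutoff <- 0.05 # assumed significance threshold, for illustration only

# Genes passing the cutoff; DDX27 (FDR of 1) is excluded.
hits <- as.character(top.genes$id[top.genes[["neg|fdr"]] < fdr.cutoff])
hits
```

The same filter should apply to the full table, for example `d1.genes$id[d1.genes[["neg|fdr"]] < fdr.cutoff]`.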

We can then provide this data to the CRISPRball function.

genes <- list(ESC = d1.genes, plasmid = d2.genes)
sgrnas <- list(ESC = d1.sgrnas, plasmid = d2.sgrnas)

CRISPRball(
    gene.data = genes, sgrna.data = sgrnas,
    count.summary = count.summ, norm.counts = norm.counts
)
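
When several comparisons are loaded, the repeated `read.delim` calls can be wrapped in a small helper. The following is only a sketch: `read_summaries` is a hypothetical name, not part of CRISPRball, and it assumes tab-delimited MAGeCK summary files. The `check.names = FALSE` argument matters because R would otherwise rewrite column names such as `neg|fdr` to `neg.fdr`.

```r
# Hypothetical helper (not part of CRISPRball): read tab-delimited MAGeCK
# summary files into a named list of data frames, one per dataset.
read_summaries <- function(paths) {
    stopifnot(is.character(paths), !is.null(names(paths)))
    # lapply() preserves the names of `paths`, so the list is named by dataset.
    lapply(paths, function(p) read.delim(p, check.names = FALSE))
}

# Self-contained demonstration: a temporary file stands in for real output.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tnum\tneg|fdr", "PMPCB\t5\t0.00099"), tmp)

demo <- read_summaries(c(ESC = tmp))
names(demo)
colnames(demo$ESC)
```

With the packaged example files, passing a named vector of `system.file()` paths to such a helper would be expected to yield a list shaped like `genes` or `sgrnas` above.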

1.4 MAGeCK MLE Output

CRISPRball also supports the MLE output from MAGeCK. In this case, we’ll use the example data from the fourth MAGeCK tutorial.

# Read the MLE gene summary and the count summaries.
genes <- read_mle_gene_summary(system.file("extdata", "beta_leukemia.gene_summary.txt",
    package = "CRISPRball"
))

count.summ <- read.delim(system.file("extdata", "escneg.countsummary.txt",
    package = "CRISPRball"
), check.names = FALSE)
norm.counts <- read.delim(system.file("extdata", "escneg.count_normalized.txt",
    package = "CRISPRball"
), check.names = FALSE)

CRISPRball(
    gene.data = genes, 
    count.summary = count.summ, norm.counts = norm.counts
)

1.5 The QC Tab

On load, the application displays the QC tab, which provides multiple interactive plots to assess the quality of all samples in the dataset. These include plots of the Gini index (a measure of read distribution inequality), counts of fully depleted sgRNAs, percentage of reads mapped, and read distributions across sgRNAs, as well as a correlation matrix between samples and a PCA plot containing all samples.
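
To make two of these metrics concrete, the sketch below computes a textbook Gini coefficient for a vector of sgRNA read counts, along with the number of fully depleted (zero-count) sgRNAs. The `gini_index` function and the toy count vectors are illustrative assumptions. MAGeCK's own calculation may differ, for example by transforming counts before computing the index.

```r
# Textbook Gini coefficient: 0 means reads are spread evenly across sgRNAs,
# values approaching 1 mean reads are concentrated in a few sgRNAs.
gini_index <- function(x) {
    stopifnot(is.numeric(x), all(x >= 0), sum(x) > 0)
    x <- sort(x)
    n <- length(x)
    2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

# Toy read counts for four sgRNAs (made-up values).
even.counts <- c(100, 100, 100, 100)
skewed.counts <- c(0, 0, 0, 400)

gini.even <- gini_index(even.counts)     # 0
gini.skewed <- gini_index(skewed.counts) # 0.75

# Number of fully depleted (zero-count) sgRNAs in the skewed sample.
n.depleted <- sum(skewed.counts == 0)    # 3
```

A higher index indicates that a small number of sgRNAs account for most of the reads in a sample.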

Controls to adjust the plots are provided in the sidebar on the left side of the application. Plots are resizable and can be downloaded as SVGs using the plotly controls. In addition, interactive HTML versions of the plots can be downloaded with the Download buttons above or below each plot.