1 Introduction

Example of epistack outputs

The main objective of the epistack package is the visualisation of stacks of genomic tracks (such as, but not restricted to, ChIP-seq or DNA methylation data) centered at genomic regions of interest. epistack needs three different inputs:

These inputs are then combined in a single GRanges object used by epistack’s plotting functions.

After introducing epistack’s plotting capabilities, this document will present two use cases:

2 Epistack visualisation

Epistack is a visualisation package. It uses a GRanges object as input, with matrices embedded as metadata columns (mcols()). We will discuss how to build such input objects in the next section. For now, we will focus on the visualisation functions using the example dataset included in the package.

The dataset can be accessed with:

library(GenomicRanges)
library(epistack)

data("stackepi")
dim(mcols(stackepi))
#> [1] 693  54
stackepi[, 1:6]
#> GRanges object with 693 ranges and 6 metadata columns:
#>                      seqnames            ranges strand |            gene_id
#>                         <Rle>         <IRanges>  <Rle> |        <character>
#>   ENSSSCG00000016737       18 50563134-50566857      - | ENSSSCG00000016737
#>   ENSSSCG00000036350       18 46205448-46215858      + | ENSSSCG00000036350
#>   ENSSSCG00000037869       18 23881967-23924133      + | ENSSSCG00000037869
#>   ENSSSCG00000016444       18   6334193-6348959      + | ENSSSCG00000016444
#>   ENSSSCG00000016714       18 47169867-47173866      + | ENSSSCG00000016714
#>                  ...      ...               ...    ... .                ...
#>   ENSSSCG00000043874       18 27323765-27356703      - | ENSSSCG00000043874
#>   ENSSSCG00000050367       18 36937862-37021223      + | ENSSSCG00000050367
#>   ENSSSCG00000032793       18 36674594-36717073      - | ENSSSCG00000032793
#>   ENSSSCG00000024209       18 12396535-12396732      + | ENSSSCG00000024209
#>   ENSSSCG00000048227       18 40765348-40779500      + | ENSSSCG00000048227
#>                            exp     score  window_1  window_2  window_3
#>                      <numeric> <numeric> <numeric> <numeric> <numeric>
#>   ENSSSCG00000016737  1213.478         0  0.910984  0.889177  0.879568
#>   ENSSSCG00000036350   771.328         0  0.388921  0.371259  0.425591
#>   ENSSSCG00000037869   270.641         0  0.595544  0.621201  0.616013
#>   ENSSSCG00000016444   261.168         0  0.881246  0.843490  0.706214
#>   ENSSSCG00000016714   193.318         0  0.877790  0.884266  0.899063
#>                  ...       ...       ...       ...       ...       ...
#>   ENSSSCG00000043874         0         0  0.355181  0.255744  0.499475
#>   ENSSSCG00000050367         0         0  0.830908  0.895926  0.892254
#>   ENSSSCG00000032793         0         0  0.671551  0.618429  0.614749
#>   ENSSSCG00000024209         0         0  0.585595  0.586773  0.562818
#>   ENSSSCG00000048227         0         0  0.801947  0.775026  0.835155
#>   -------
#>   seqinfo: 351 sequences from an unspecified genome; no seqlengths

2.1 The plotEpistack() function

This dataset can be visualised with the plotEpistack() function. The first parameter is the input GRanges object.

The second parameter, patterns, specifies which columns of mcols(gr) should be displayed as heatmap(s). The patterns values are prefixes or regular expressions that should match a set of column names. In the stackepi dataset, only one track is present, with column names starting with window. Note that it is possible to have several different tracks embedded in the same GRanges object, as demonstrated in the next sections.
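To see which columns a given pattern would select, the matching can be sketched in base R. The column names below are copied from the stackepi printout above. The use of grep() is an assumption made for illustration and is not necessarily how plotEpistack() resolves patterns internally.

```r
# Column names as shown in the stackepi printout (first six columns only)
col_names <- c("gene_id", "exp", "score", "window_1", "window_2", "window_3")

# Illustrative assumption: the pattern is treated as a regular expression
# and matched against the column names
track_cols <- grep("^window_", col_names, value = TRUE)
print(track_cols)
```

Only the three window columns are kept; gene_id, exp and score do not start with window_ and are left out of the track.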

An additional parameter, metric_col, is used to display a score associated with each anchor region, such as expression values or peak scores. Optionally, the metric_col values can be transformed before plotting using the metric_transfunc parameter.

plotEpistack(
  stackepi,
  pattern = "^window_", metric_col = "exp",
  ylim = c(0, 1), zlim = c(0, 1),
  x_labels = c("-2.5kb", "TSS", "+2.5kb"),
  titles = "DNA methylation", legends = "%mCpG",
  metric_title = "Expression", metric_label = "log10(TPM+1)",
  metric_transfunc = function(x) log10(x+1)
)
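To get a feel for the transformation passed as metric_transfunc in the call above, the same anonymous function can be applied on its own. This is a minimal sketch: the input values are hypothetical and are not taken from the dataset.

```r
# Same transformation as the one passed to metric_transfunc
transfunc <- function(x) log10(x + 1)

# Hypothetical expression values, chosen so the result is easy to read
tpm <- c(0, 9, 99, 999)
transformed <- transfunc(tpm)
print(transformed)
```

The +1 offset keeps regions with zero expression at zero after transformation, whereas log10(0) alone would give -Inf.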

If a bin column is present, it is used to generate one average profile per bin.

stackepi <- addBins(stackepi, nbins = 5)

plotEpistack(
  stackepi,
  pattern = "^window_", metric_col = "exp",
  ylim = c(0, 1), zlim = c(0, 1),
  x_labels = c("-2.5kb", "TSS", "+2.5kb"),
  titles = "DNA methylation", legends = "%mCpG",
  metric_title = "Expression", metric_label = "log10(TPM+1)",
  metric_transfunc = function(x) log10(x+1)
)
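The idea of one average profile per bin can be sketched on toy data in base R. Everything here is hypothetical: the names mat, bin and profiles, the toy values, and the choice of equal-size bins following row order are assumptions for illustration, not a description of what addBins() or plotEpistack() do internally.

```r
# Toy track: 10 regions (rows) x 4 windows (columns)
mat <- matrix(seq_len(40) / 40, nrow = 10)
colnames(mat) <- paste0("window_", 1:4)

# Assumption: regions are already sorted, and bins follow row order
nbins <- 5
bin <- cut(seq_len(nrow(mat)), breaks = nbins, labels = FALSE)

# Per-bin column sums divided by bin sizes give one mean profile per bin
profiles <- rowsum(mat, bin) / as.vector(table(bin))
print(profiles)
```

Each row of profiles is the mean signal across the regions of one bin, which corresponds to one line in an average-profile plot.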

Colours can be changed using dedicated parameters:

plotEpistack(
  stackepi,
  pattern = "^window_", metric_col = "exp",
  ylim = c(0, 1), zlim = c(0, 1),
  x_labels = c("-2.5kb", "TSS", "+2.5kb"),
  titles = "DNA methylation", legends = "%mCpG",
  metric_title = "Expression", metric_label = "log10(TPM+1)",
  metric_transfunc = function(x) log10(x+1),
  tints = "dodgerblue",
  bin_palette = rainbow
)
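In the call above, bin_palette is given rainbow, a function that takes a number n and returns n colours. As a sketch, a custom palette of the same shape can be built with colorRampPalette() from grDevices. The name my_palette and the chosen colours are hypothetical, and it is an assumption that any function of this form is accepted by bin_palette.

```r
# rainbow(n) returns n colours as hexadecimal strings
print(rainbow(5))

# A hypothetical custom palette with the same calling convention
my_palette <- colorRampPalette(c("darkorange", "grey40", "dodgerblue"))
bin_colours <- my_palette(5)
print(bin_colours)
```

Under that assumption, my_palette could be passed in place of rainbow to colour the five bins.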