HTML reports for a set of regions

If you wish, you can view this vignette online here.

regionReport (Collado-Torres, Jaffe, and Leek, 2014) creates HTML reports styled with knitrBootstrap (Hester, 2013) for a set of regions such as derfinder (Collado-Torres, Frazee, Jaffe, and Leek, 2014) results.

Currently, this package includes a basic exploratory analysis of derfinder results, which we expect users will want to reproduce with their own data. The analysis is written in R Markdown: derfinderReport() takes the results from derfinder, performs a couple of setup operations, and then relies on knitr (Xie, 2014), rmarkdown (Allaire, McPherson, Xie, Wickham, et al., 2014), and knitrBootstrap (Hester, 2013) to generate the report.

Using regionReport

General case

This is currently under construction.

derfinder case

Goodies in this report are powered by ggbio (Yin, Cook, and Lawrence, 2012) and ggplot2 (Wickham, 2009).

Run derfinder

Prior to using regionReport::derfinderReport() you must use derfinder to analyze a specific data set. While there are many ways to do so, we recommend running analyzeChr() for each chromosome with the same prefix argument and then merging the results with mergeResults().
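As a rough illustration of the layout this recommendation leads to, the sketch below checks which per-chromosome directories are present under a shared prefix before merging. It assumes that the results for each chromosome live in a directory named after the chromosome (for example report/chr21, as in the example further down). missingChrDirs() is a hypothetical helper written for this illustration and is not part of derfinder.

```r
## Hypothetical helper (not part of derfinder): return the chromosomes whose
## result directory is missing under the given prefix
missingChrDirs <- function(prefix, chrs) {
    dirs <- file.path(prefix, chrs)
    chrs[!dir.exists(dirs)]
}

## Toy example in a temporary location that only contains chr21
prefix <- file.path(tempdir(), 'reportLayoutExample')
dir.create(file.path(prefix, 'chr21'), showWarnings = FALSE, recursive = TRUE)

## Chromosomes that still need to be analyzed before merging
missing <- missingChrDirs(prefix, c('chr21', 'chr22'))
print(missing)
```

If the returned vector is empty, every chromosome has a result directory under the prefix and the merge step can proceed.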

Below, we run derfinder for the example data included in the package. The steps are:

  1. Load derfinder
  2. Create a directory where we'll store the results
  3. Generate the pre-requisites for the models to use with the example data
  4. Generate the statistical models
  5. Analyze the example data for chr21
  6. Merge the results (only one chr in this case, but in practice there'll be more)

## Load derfinder
library('derfinder')

## The output will be saved in the 'report' directory
dir.create('report', showWarnings = FALSE, recursive = TRUE)

The following code runs derfinder.

## Save the current path
initialPath <- getwd()
setwd(file.path(initialPath, 'report'))

## Generate output from derfinder

## Collapse the coverage information
collapsedFull <- collapseFullCoverage(list(genomeData$coverage),
    verbose=TRUE)

## Calculate library size adjustments
sampleDepths <- sampleDepth(collapsedFull, probs=c(0.5), nonzero=TRUE,
    verbose=TRUE)

## Build the models
group <- genomeInfo$pop
adjustvars <- data.frame(genomeInfo$gender)
models <- makeModels(sampleDepths, testvars=group, adjustvars=adjustvars)

## Analyze chromosome 21
analysis <- analyzeChr(chr='21', coverageInfo=genomeData, models=models,
    cutoffFstat=1, cutoffType='manual', seeds=20140330, groupInfo=group,
    mc.cores=1, writeOutput=TRUE, returnOutput=TRUE)

## Save the stats options for later
optionsStats <- analysis$optionsStats

## Change the directory back to the original one
setwd(initialPath)

For convenience, the derfinder results for this example are also available pre-computed; the code below copies them from the extdata directory of derfinder. Note that the above functions are routinely checked as part of derfinder.

## Copy previous results
file.copy(system.file(file.path('extdata', 'chr21'), package='derfinder',
    mustWork=TRUE), 'report', recursive=TRUE)
## [1] TRUE

Next, proceed to merging the results.

## Merge the results from the different chromosomes. In this case, there's 
## only one: chr21
mergeResults(chrs = 'chr21', prefix = 'report',
    genomicState = genomicState$fullGenome)
## 2014-11-05 22:09:24 mergeResults: Saving options used
## 2014-11-05 22:09:24 Loading chromosome chr21
## Neither 'cutoffFstatUsed' nor 'optionsStats' were supplied, so the FWER calculation step will be skipped.
## 2014-11-05 22:09:24 mergeResults: Saving fullNullSummary
## 2014-11-05 22:09:24 mergeResults: Re-calculating the p-values
## 2014-11-05 22:09:24 mergeResults: Saving fullRegions
## 2014-11-05 22:09:24 mergeResults: assigning genomic states
## 2014-11-05 22:09:25 annotateRegions: counting
## 2014-11-05 22:09:25 annotateRegions: annotating
## 2014-11-05 22:09:25 mergeResults: Saving fullAnnotatedRegions
## 2014-11-05 22:09:25 mergeResults: Saving fullFstats
## 2014-11-05 22:09:25 mergeResults: Saving fullTime

Create report

Once the derfinder output has been generated and merged, use derfinderReport() to create the HTML report.

## Load regionReport
library('regionReport')

## Generate the HTML report
report <- derfinderReport(prefix='report', browse=FALSE,
    nBestRegions=15, makeBestClusters=TRUE, outdir='html',
    fullCov=list('21'=genomeDataRaw$coverage), optionsStats=optionsStats)

Once the output is generated, you can browse the report from R using browseURL() as shown below.

## Browse the report
browseURL(report)

You can compare the resulting report with the pre-compiled report using the following code.

browseURL(system.file(file.path('basicExploration', 'basicExploration.html'),  
    package = 'regionReport', mustWork = TRUE))

Notes

Note that the reports require an active Internet connection to render correctly.

The report is self-explanatory, and some of its text changes depending on the input options.

If the report is taking too long to compile (say more than 3 hours), you might want to consider setting nBestClusters to a small number or even setting makeBestClusters to FALSE.
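A minimal sketch of the faster configuration follows. It reuses the argument values from the derfinderReport() call in the 'Create report' section and only switches makeBestClusters to FALSE. The names reportArgs, fastArgs and reportFast are introduced here for illustration, and the call itself is only attempted when both packages, the optionsStats object and the 'report' directory from the earlier steps are available.

```r
## Arguments taken from the earlier derfinderReport() call
reportArgs <- list(prefix='report', browse=FALSE, nBestRegions=15,
    makeBestClusters=TRUE, outdir='html')

## Skip the cluster plots, which can take a long time to generate
fastArgs <- modifyList(reportArgs, list(makeBestClusters=FALSE))

## Only attempt the call when the earlier steps of this vignette have been run
if(requireNamespace('regionReport', quietly=TRUE) &&
    requireNamespace('derfinder', quietly=TRUE) &&
    exists('optionsStats') && file.exists('report')) {
    library('derfinder')
    reportFast <- do.call(regionReport::derfinderReport, c(fastArgs,
        list(fullCov=list('21'=genomeDataRaw$coverage),
            optionsStats=optionsStats)))
}
```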

Advanced arguments

If you are interested in using the advanced arguments, use derfinder::advancedArg() as shown below:

## URLs to advanced arguments
derfinder::advancedArg('derfinderReport', package = 'regionReport',
    browse = FALSE)
## Set browse = TRUE if you want to open them in your browser

Reproducibility

This package was made possible thanks to:

Code for creating the vignette

## Create the vignette
library('knitrBootstrap') 

knitrBootstrapFlag <- packageVersion('knitrBootstrap') < '1.0.0'
if(knitrBootstrapFlag) {
    ## CRAN version
    system.time(knit_bootstrap('regionReport.Rmd', chooser=c('boot', 'code'), show_code = TRUE))
    unlink('regionReport.md')
} else {
    ## GitHub version
    library('rmarkdown')
    system.time(render('regionReport.Rmd',
        'knitrBootstrap::bootstrap_document'))
}
## Note: if you prefer the knitr version use:
# library('rmarkdown')
# system.time(render('regionReport.Rmd', 'html_document'))

## Extract the R code
library('knitr')
knit('regionReport.Rmd', tangle = TRUE)

## Copy report output to be distributed with the package for comparison 
## purposes
if(gsub('.*/', '', getwd()) == 'realVignettes') {
    file.copy(file.path('report', 'html', 'basicExploration.html'),
        file.path('..', '..', 'inst', 'basicExploration',
            'basicExploration.html'), overwrite=TRUE)
} else {
    file.copy(file.path('report', 'html', 'basicExploration.html'),
        file.path('..', 'inst', 'basicExploration', 'basicExploration.html'),
            overwrite=TRUE)
}

## Clean up
file.remove('regionReportRef.bib')
#unlink('regionReport_files', recursive=TRUE)
unlink('report', recursive = TRUE)

Date the vignette was generated.

## [1] "2014-11-05 22:10:22 PST"

Wallclock time spent generating the vignette.

## Time difference of 1.185 mins
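The date and the elapsed time shown above can be obtained with base R alone. The following is a minimal sketch, assuming that a variable named startTime (a hypothetical name) is recorded at the top of the document; it is not necessarily the exact code used to build this vignette.

```r
## Record the start time at the top of the document
startTime <- Sys.time()

## ... the body of the vignette would run here ...

## Date the document was generated
print(Sys.time())

## Wallclock time spent, expressed in minutes
totalTime <- difftime(Sys.time(), startTime, units = 'mins')
print(round(totalTime, digits = 3))
```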

R session information.

## Session info---------------------------------------------------------------------------------------
##  setting  value                                             
##  version  R Under development (unstable) (2014-11-03 r66928)
##  system   x86_64, linux-gnu                                 
##  ui       X11                                               
##  language en_US:                                            
##  collate  C                                                 
##  tz       
## Packages-------------------------------------------------------------------------------------------
##  package           * version  date       source        
##  AnnotationDbi       1.29.1   2014-11-06 Bioconductor  
##  BBmisc              1.8      2014-10-30 CRAN (R 3.2.0)
##  BSgenome            1.35.5   2014-11-06 Bioconductor  
##  BatchJobs           1.5      2014-10-30 CRAN (R 3.2.0)
##  Biobase             2.27.0   2014-11-06 Bioconductor  
##  BiocGenerics        0.13.0   2014-11-06 Bioconductor  
##  BiocParallel        1.1.5    2014-11-06 Bioconductor  
##  Biostrings          2.35.2   2014-11-06 Bioconductor  
##  DBI                 0.3.1    2014-09-24 CRAN (R 3.2.0)
##  Formula             1.1.2    2014-07-13 CRAN (R 3.2.0)
##  GGally              0.4.8    2014-08-26 CRAN (R 3.2.0)
##  GenomeInfoDb        1.3.6    2014-11-06 Bioconductor  
##  GenomicAlignments   1.3.5    2014-11-06 Bioconductor  
##  GenomicFeatures     1.19.6   2014-11-06 Bioconductor  
##  GenomicFiles        1.3.8    2014-11-06 Bioconductor  
##  GenomicRanges       1.19.4   2014-11-06 Bioconductor  
##  Hmisc               3.14.5   2014-09-12 CRAN (R 3.2.0)
##  IRanges             2.1.6    2014-11-06 Bioconductor  
##  MASS                7.3.35   2014-09-30 CRAN (R 3.2.0)
##  Matrix              1.1.4    2014-06-15 CRAN (R 3.2.0)
##  OrganismDbi         1.9.0    2014-11-06 Bioconductor  
##  R.methodsS3         1.6.1    2014-01-05 CRAN (R 3.2.0)
##  RBGL                1.43.0   2014-11-06 Bioconductor  
##  RColorBrewer        1.0.5    2011-06-17 CRAN (R 3.2.0)
##  RCurl               1.95.4.3 2014-07-29 CRAN (R 3.2.0)
##  RJSONIO             1.3.0    2014-07-28 CRAN (R 3.2.0)
##  RSQLite             1.0.0    2014-10-25 CRAN (R 3.2.0)
##  Rcpp                0.11.3   2014-09-29 CRAN (R 3.2.0)
##  RefManageR          0.8.40   2014-10-29 CRAN (R 3.2.0)
##  Rsamtools           1.19.8   2014-11-06 Bioconductor  
##  S4Vectors           0.5.4    2014-11-06 Bioconductor  
##  VariantAnnotation   1.13.6   2014-11-06 Bioconductor  
##  XML                 3.98.1.1 2013-06-20 CRAN (R 3.2.0)
##  XVector             0.7.2    2014-11-06 Bioconductor  
##  acepack             1.3.3.3  2013-05-03 CRAN (R 3.2.0)
##  base64enc           0.1.2    2014-06-26 CRAN (R 3.2.0)
##  bibtex              0.3.6    2013-07-29 CRAN (R 3.2.0)
##  biomaRt             2.23.0   2014-11-06 Bioconductor  
##  biovizBase          1.15.0   2014-11-06 Bioconductor  
##  bitops              1.0.6    2013-08-17 CRAN (R 3.2.0)
##  brew                1.0.6    2011-04-13 CRAN (R 3.2.0)
##  bumphunter          1.7.0    2014-11-06 Bioconductor  
##  checkmate           1.5.0    2014-10-19 CRAN (R 3.2.0)
##  cluster             1.15.3   2014-09-04 CRAN (R 3.2.0)
##  codetools           0.2.9    2014-08-21 CRAN (R 3.2.0)
##  colorspace          1.2.4    2013-09-30 CRAN (R 3.2.0)
##  derfinder         * 1.1.9    2014-11-06 Bioconductor  
##  derfinderHelper     1.1.5    2014-11-06 Bioconductor  
##  derfinderPlot       1.1.5    2014-11-06 Bioconductor  
##  devtools          * 1.6.1    2014-10-07 CRAN (R 3.2.0)
##  dichromat           2.0.0    2013-01-24 CRAN (R 3.2.0)
##  digest              0.6.4    2013-12-03 CRAN (R 3.2.0)
##  doRNG               1.6      2014-03-07 CRAN (R 3.2.0)
##  evaluate            0.5.5    2014-04-29 CRAN (R 3.2.0)
##  fail                1.2      2013-09-19 CRAN (R 3.2.0)
##  foreach             1.4.2    2014-04-11 CRAN (R 3.2.0)
##  foreign             0.8.61   2014-03-28 CRAN (R 3.2.0)
##  formatR             1.0      2014-08-25 CRAN (R 3.2.0)
##  ggbio               1.15.0   2014-11-06 Bioconductor  
##  ggplot2             1.0.0    2014-05-21 CRAN (R 3.2.0)
##  graph               1.45.0   2014-11-06 Bioconductor  
##  gridExtra           0.9.1    2012-08-09 CRAN (R 3.2.0)
##  gtable              0.1.2    2012-12-05 CRAN (R 3.2.0)
##  htmltools           0.2.6    2014-09-08 CRAN (R 3.2.0)
##  httr                0.5      2014-09-02 CRAN (R 3.2.0)
##  iterators           1.0.7    2014-04-11 CRAN (R 3.2.0)
##  knitcitations     * 1.0.4    2014-10-28 CRAN (R 3.2.0)
##  knitr               1.7      2014-10-13 CRAN (R 3.2.0)
##  knitrBootstrap    * 0.9.0    2013-10-17 CRAN (R 3.2.0)
##  lattice             0.20.29  2014-04-04 CRAN (R 3.2.0)
##  latticeExtra        0.6.26   2013-08-15 CRAN (R 3.2.0)
##  locfit              1.5.9.1  2013-04-20 CRAN (R 3.2.0)
##  lubridate           1.3.3    2013-12-31 CRAN (R 3.2.0)
##  markdown            0.7.4    2014-08-24 CRAN (R 3.2.0)
##  matrixStats         0.10.3   2014-10-15 CRAN (R 3.2.0)
##  memoise             0.2.1    2014-04-22 CRAN (R 3.2.0)
##  mgcv                1.8.3    2014-08-29 CRAN (R 3.2.0)
##  munsell             0.4.2    2013-07-11 CRAN (R 3.2.0)
##  nlme                3.1.118  2014-10-07 CRAN (R 3.2.0)
##  nnet                7.3.8    2014-03-28 CRAN (R 3.2.0)
##  pkgmaker            0.22     2014-05-14 CRAN (R 3.2.0)
##  plyr                1.8.1    2014-02-26 CRAN (R 3.2.0)
##  proto               0.3.10   2012-12-22 CRAN (R 3.2.0)
##  qvalue              1.41.0   2014-11-06 Bioconductor  
##  regionReport      * 1.1.6    2014-11-06 Bioconductor  
##  registry            0.2      2012-01-24 CRAN (R 3.2.0)
##  reshape             0.8.5    2014-04-23 CRAN (R 3.2.0)
##  reshape2            1.4      2014-04-23 CRAN (R 3.2.0)
##  rmarkdown           0.3.3    2014-09-17 CRAN (R 3.2.0)
##  rngtools            1.2.4    2014-03-06 CRAN (R 3.2.0)
##  rpart               4.1.8    2014-03-28 CRAN (R 3.2.0)
##  rstudioapi          0.1      2014-03-27 CRAN (R 3.2.0)
##  rtracklayer         1.27.3   2014-11-06 Bioconductor  
##  scales              0.2.4    2014-04-22 CRAN (R 3.2.0)
##  sendmailR           1.2.1    2014-09-21 CRAN (R 3.2.0)
##  stringr             0.6.2    2012-12-06 CRAN (R 3.2.0)
##  survival            2.37.7   2014-01-22 CRAN (R 3.2.0)
##  xtable              1.7.4    2014-09-12 CRAN (R 3.2.0)
##  zlibbioc            1.13.0   2014-11-06 Bioconductor

Bibliography

This vignette was generated using knitrBootstrap (Hester, 2013) with knitr (Xie, 2014) and rmarkdown (Allaire, McPherson, Xie, Wickham, et al., 2014) running behind the scenes.

Citations made with knitcitations (Boettiger, 2014).

[1] J. Allaire, J. McPherson, Y. Xie, H. Wickham, et al. rmarkdown: Dynamic Documents for R. R package version 0.3.3. 2014. URL: http://CRAN.R-project.org/package=rmarkdown.

[2] S. Arora, M. Morgan, M. Carlson and H. Pages. GenomeInfoDb: Utilities for manipulating chromosome and other 'seqname' identifiers. R package version 1.3.6. 2014.

[3] B. Auguie. gridExtra: functions in Grid graphics. R package version 0.9.1. 2012. URL: http://CRAN.R-project.org/package=gridExtra.

[4] C. Boettiger. knitcitations: Citations for knitr markdown files. R package version 1.0.4. 2014. URL: http://CRAN.R-project.org/package=knitcitations.

[5] M. Carlson. TxDb.Hsapiens.UCSC.hg19.knownGene: Annotation package for TxDb object(s). R package version 3.0.0. 2014.

[6] L. Collado-Torres, A. C. Frazee, A. E. Jaffe and J. T. Leek. derfinder: Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution. https://github.com/lcolladotor/derfinder - R package version 1.1.9. 2014. URL: http://www.bioconductor.org/packages/release/bioc/html/derfinder.html.

[7] L. Collado-Torres, A. E. Jaffe and J. T. Leek. derfinderPlot: Plotting functions for derfinder. https://github.com/lcolladotor/derfinderPlot - R package version 1.1.5. 2014. URL: http://www.bioconductor.org/packages/release/bioc/html/derfinderPlot.html.

[8] L. Collado-Torres, A. E. Jaffe and J. T. Leek. regionReport: Generate HTML reports for exploring a set of regions. https://github.com/lcolladotor/regionReport - R package version 1.1.6. 2014. URL: http://www.bioconductor.org/packages/release/bioc/html/regionReport.html.

[9] J. Hester. knitrBootstrap: Knitr Bootstrap framework. R package version 0.9.0. 2013. URL: http://CRAN.R-project.org/package=knitrBootstrap.

[10] M. Lawrence, W. Huber, H. Pagès, P. Aboyoun, et al. “Software for Computing and Annotating Genomic Ranges”. In: PLoS Computational Biology 9.8 (2013). DOI: 10.1371/journal.pcbi.1003118. URL: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118.

[11] E. Neuwirth. RColorBrewer: ColorBrewer palettes. R package version 1.0-5. 2011. URL: http://CRAN.R-project.org/package=RColorBrewer.

[12] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2014. URL: http://www.R-project.org/.

[13] S. Urbanek and J. Horner. Cairo: R graphics device using cairo graphics library for creating high-quality bitmap (PNG, JPEG, TIFF), vector (PDF, SVG, PostScript) and display (X11 and Win32) output. R package version 1.5-6. 2014. URL: http://CRAN.R-project.org/package=Cairo.

[14] H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009. ISBN: 978-0-387-98140-6. URL: http://had.co.nz/ggplot2/book.

[15] H. Wickham and W. Chang. devtools: Tools to make developing R code easier. R package version 1.6.1. 2014. URL: http://CRAN.R-project.org/package=devtools.

[16] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014. URL: http://www.crcpress.com/product/isbn/9781466561595.

[17] T. Yin, D. Cook and M. Lawrence. “ggbio: an R package for extending the grammar of graphics for genomic data”. In: Genome Biology 13.8 (2012), p. R77.

[18] T. Yin, M. Lawrence and D. Cook. biovizBase: Basic graphic utilities for visualization of genomic data. R package version 1.15.0. 2014.
