1 Introduction

This package is designed for Reactome pathway-based analysis. Reactome is an open-source, open-access, manually curated and peer-reviewed pathway database.

2 Citation

If you use ReactomePA in published research, please cite G Yu (2016). In addition, please cite G Yu (2012) when using compareCluster in clusterProfiler, and G Yu (2015) when applying enrichment analysis to NGS data using ChIPseeker.

G Yu, QY He.
ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization.
Molecular BioSystems 2016, 12(2):477-479.

URL: http://dx.doi.org/10.1039/C5MB00663E

G Yu, LG Wang, Y Han, QY He.
clusterProfiler: an R package for comparing biological themes among gene clusters.
OMICS: A Journal of Integrative Biology 2012, 16(5):284-287.

URL: http://dx.doi.org/10.1089/omi.2011.0118

G Yu, LG Wang, QY He.
ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization.
Bioinformatics 2015, 31(14):2382-2383.

URL: http://dx.doi.org/10.1093/bioinformatics/btv145

3 Supported organisms

Currently ReactomePA supports several model organisms, including ‘celegans’, ‘fly’, ‘human’, ‘mouse’, ‘rat’, ‘yeast’ and ‘zebrafish’. The input gene ID should be Entrez gene ID. We recommend using clusterProfiler::bitr to convert biological IDs. For more detail, please refer to bitr: Biological Id TranslatoR.
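
As a minimal sketch of the ID conversion step, the snippet below converts a few gene symbols to Entrez gene IDs with clusterProfiler::bitr. It assumes that clusterProfiler and the human annotation package org.Hs.eg.db are installed; the gene symbols are arbitrary illustrative input, and the variable names are not part of ReactomePA.

```r
# Illustrative input: a handful of human gene symbols (hypothetical example).
symbols <- c("CDK1", "CCNB1", "PLK1", "AURKB")

ids <- NULL
# Run the conversion only when the required packages are available.
if (requireNamespace("clusterProfiler", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  # bitr returns a data.frame with one column per ID type.
  ids <- clusterProfiler::bitr(symbols,
                               fromType = "SYMBOL",
                               toType   = "ENTREZID",
                               OrgDb    = "org.Hs.eg.db")
  # The ENTREZID column can then be passed to enrichPathway(gene = ...).
  entrez <- ids$ENTREZID
}
```

For other supported organisms, the corresponding annotation package would be supplied as OrgDb instead.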

4 Pathway Enrichment Analysis

Enrichment analysis is a widely used approach to identify biological themes. Here, we implement a hypergeometric model to assess whether the number of selected genes associated with a Reactome pathway is larger than expected. The p values are calculated based on the hypergeometric model:

library(ReactomePA)
data(geneList, package = "DOSE")
# select genes whose absolute value in geneList exceeds 1.5
de <- names(geneList)[abs(geneList) > 1.5]
head(de)
## [1] "4312"  "8318"  "10874" "55143" "55388" "991"
x <- enrichPathway(gene = de, pvalueCutoff = 0.05, readable = TRUE)
head(summary(x))
##              ID                             Description GeneRatio  BgRatio
## 68877     68877                    Mitotic Prometaphase    25/248 100/6749
## 69278     69278                     Cell Cycle, Mitotic    49/248 408/6749
## 2500257 2500257 Resolution of Sister Chromatid Cohesion    23/248  92/6749
## 1640170 1640170                              Cell Cycle    54/248 497/6749
## 5663220 5663220            RHO GTPases Activate Formins    21/248 102/6749
## 68886     68886                                 M Phase    30/248 235/6749
##               pvalue     p.adjust       qvalue
## 68877   8.542219e-15 4.459038e-12 3.938412e-12
## 69278   4.753467e-14 1.240655e-11 1.095799e-11
## 2500257 1.068802e-13 1.719537e-11 1.518768e-11
## 1640170 1.317653e-13 1.719537e-11 1.518768e-11
## 5663220 7.266024e-11 7.585729e-09 6.700039e-09
## 68886   1.540540e-09 1.340270e-07 1.183784e-07
##                                                                                                                                                                                                                                                                                                                               geneID
## 68877                                                                                                                                                                          CDCA8/CDC20/CENPE/CCNB2/NDC80/NCAPH/SKA1/CENPM/CENPN/CDK1/ERCC6L/MAD2L1/KIF18A/BIRC5/NCAPG/AURKB/CCNB1/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/TAOK1
## 69278                                CDC45/CDCA8/MCM10/CDC20/FOXM1/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/RRM2/UBE2C/SKA1/NEK2/CENPM/CENPN/CCNA2/CDK1/ERCC6L/MAD2L1/GINS1/KIF18A/CDT1/BIRC5/NCAPG/AURKB/GINS2/KIF20A/AURKA/CCNB1/MCM5/PTTG1/MCM2/KIF2C/CDC25A/CDC6/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/ESPL1/CCNE1/ORC6/ORC1/TAOK1
## 2500257                                                                                                                                                                                    CDCA8/CDC20/CENPE/CCNB2/NDC80/SKA1/CENPM/CENPN/CDK1/ERCC6L/MAD2L1/KIF18A/BIRC5/AURKB/CCNB1/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/TAOK1
## 1640170 CDC45/CDCA8/MCM10/CDC20/FOXM1/KIF23/CENPE/MYBL2/CCNB2/NDC80/TOP2A/NCAPH/RRM2/UBE2C/HJURP/SKA1/NEK2/CENPM/CENPN/CCNA2/CDK1/ERCC6L/MAD2L1/GINS1/KIF18A/CDT1/BIRC5/NCAPG/AURKB/GINS2/CHEK1/KIF20A/AURKA/CCNB1/MCM5/PTTG1/LMNB1/MCM2/KIF2C/CDC25A/CDC6/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/ESPL1/RAD51/CCNE1/ORC6/ORC1/OIP5/TAOK1
## 5663220                                                                                                                                                                                                 CDCA8/CDC20/CENPE/NDC80/SKA1/CENPM/CENPN/ERCC6L/MAD2L1/KIF18A/BIRC5/AURKB/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/TAOK1/EVL
## 68886                                                                                                                                           CDCA8/CDC20/KIF23/CENPE/CCNB2/NDC80/NCAPH/UBE2C/SKA1/CENPM/CENPN/CDK1/ERCC6L/MAD2L1/KIF18A/BIRC5/NCAPG/AURKB/KIF20A/CCNB1/PTTG1/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/ESPL1/TAOK1
##         Count
## 68877      25
## 69278      49
## 2500257    23
## 1640170    54
## 5663220    21
## 68886      30

For calculation/parameter details, please refer to the vignette of DOSE.
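
To illustrate the hypergeometric model, the following sketch computes an over-representation p value in base R from the GeneRatio and BgRatio reported above for "Mitotic Prometaphase" (25/248 and 100/6749). It assumes the standard one-sided test, that is, the probability of observing at least k pathway genes among the selected genes; the variable names are ours, and the exact computation in the package may differ in detail.

```r
k <- 25     # selected genes annotated to the pathway (GeneRatio numerator)
n <- 248    # selected genes with any annotation (GeneRatio denominator)
M <- 100    # background genes annotated to the pathway (BgRatio numerator)
N <- 6749   # all background genes (BgRatio denominator)

# P(X >= k) when drawing n genes without replacement from N genes,
# of which M belong to the pathway.
p <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
print(p)
```

Under these assumptions, the value should be comparable to the pvalue column of the first row of the table; the p.adjust and qvalue columns additionally correct for multiple testing across pathways.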

4.1 Pathway analysis of NGS data

Pathway analysis using NGS data (e.g., RNA-Seq and ChIP-Seq) can be performed by linking coding and non-coding regions to coding genes via the ChIPseeker package, which can annotate genomic regions to their nearest genes, host genes, and flanking genes, respectively. In addition, it provides a function, seq2gene, that simultaneously considers host genes, promoter regions and flanking genes from intergenic regions that may be under control via cis-regulation. This function maps genomic regions to genes in a many-to-many manner and facilitates functional analysis. For more details, please refer to ChIPseeker.
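
A minimal sketch of this workflow is shown below. It assumes that ChIPseeker, ReactomePA and TxDb.Hsapiens.UCSC.hg19.knownGene are installed, and it uses one of the sample peak files shipped with ChIPseeker purely for illustration; the promoter window and flanking distance are example values, not recommendations.

```r
genes <- NULL
# Run only when the required packages are available.
if (requireNamespace("ChIPseeker", quietly = TRUE) &&
    requireNamespace("TxDb.Hsapiens.UCSC.hg19.knownGene", quietly = TRUE)) {
  txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene

  # Read an example peak file bundled with ChIPseeker (illustrative input).
  peak_file <- ChIPseeker::getSampleFiles()[[1]]
  peaks <- ChIPseeker::readPeakFile(peak_file)

  # Map genomic regions to genes in a many-to-many manner.
  genes <- ChIPseeker::seq2gene(peaks,
                                tssRegion     = c(-1000, 1000),
                                flankDistance = 3000,
                                TxDb          = txdb)

  # The resulting gene IDs can then be tested for pathway enrichment.
  if (requireNamespace("ReactomePA", quietly = TRUE)) {
    x_ngs <- ReactomePA::enrichPathway(gene = genes)
  }
}
```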

4.2 Visualize enrichment result

We implement barplot, dotplot, enrichment map and category-gene-network for visualization. It is very common to visualize the enrichment result as a bar or pie chart. We believe the pie chart is misleading and only provide the bar chart.

barplot(x, showCategory=8)

dotplot(x, showCategory=15)

The enrichment map can be visualized by enrichMap:

enrichMap(x, layout=igraph::layout.kamada.kawai, vertex.label.cex = 1)

To account for the biological complexity in which a gene may belong to multiple annotation categories, we developed the cnetplot function to extract the complex associations between genes and pathways.

cnetplot(x, categorySize="pvalue", foldChange=geneList)