Contents


Package: enrichViewNet
Authors: Astrid Deschênes [aut, cre] (https://orcid.org/0000-0001-7846-6749), Pascal Belleau [aut] (https://orcid.org/0000-0002-0802-1071), Robert L. Faure [aut] (https://orcid.org/0000-0003-1798-4723), Maria J. Fernandes [aut] (https://orcid.org/0000-0002-3973-025X), Alexander Krasnitz [aut], David A. Tuveson [aut] (https://orcid.org/0000-0002-8017-2712)
Version: 1.2.0
Compiled date: 2024-04-30
License: Artistic-2.0

1 Licensing

The enrichViewNet package and the underlying enrichViewNet code are distributed under the Artistic license 2.0. You are free to use and redistribute this software.



2 Citing

If you use this package for a publication, we would ask you to cite the following:

Deschênes A, Belleau P, Faure RL, Fernandes MJ, Krasnitz A, Tuveson DA (2023). enrichViewNet: From functional enrichment results to biological networks. doi:10.18129/B9.bioc.enrichViewNet, https://bioconductor.org/packages/enrichViewNet.



3 Introduction

High-throughput technologies are routinely used in basic and applied research and are key drivers of scientific discovery. A major challenge in using these experimental approaches is the analysis of the large amount of data generated. These include lists of proteins or genes generated by mass spectrometry, single-cell RNA sequencing and/or microarray analysis, respectively. There is thus a need for robust bioinformatic and statistical tools that can analyze these large datasets and display the data in the form of networks that illustrate the biological and conceptual links with findings in the literature. This gap has been partially addressed by several bioinformatic tools that perform enrichment analysis of the data and/or present it in the form of networks.

Functional enrichment analysis tools, such as Enrichr (Kuleshov et al. 2016) and DAVID (Dennis et al. 2003), are specialized in positioning novel findings against well curated data sources of biological processes and pathways. Most specifically, those tools identify functional gene sets that are statistically over- (or under-) represented in a gene list (functional enrichment). The traditional output of a significant enrichment analysis tool is a table containing the significant gene sets with their associated statistics. While those results are extremely useful, their interpretation is challenging. The visual representation of these results as a network can greatly facilitate the interpretation of the data.

Biological network models are visual representations of various biological interacting elements which are based on mathematical graphs. In those networks, the biological elements are generally represented by nodes while the interactions and relationships are represented by edges. One of the widely used network tools in the quantitative biology community is the open source software Cytoscape (Shannon et al. 2003). In addition of biological data visualization and network analysis, Cytoscape can be expended through the use of specialized plug-ins such as BiNGO that calculates over-represented GO terms in a network (Maere, Heymans, and Kuiper 2005) or CentiScaPe that identifies relevant network nodes (Scardoni, Petterlini, and Laudanna 2009).

The g:Profiler enrichment analysis tool (Raudvere et al. 2019) is web based and has the particularity of being accompanied by the CRAN package gprofiler2 (Kolberg et al. 2020). The gprofiler2 package gives the opportunity to researchers to incorporate functional enrichment analysis into automated analysis pipelines written in R. This greatly facilitates research reproducibility.

The enrichViewNet package enables the visualization of functional enrichment results as network graphs. Visualization of enriched terms aims to facilitate the analyses of complex results. Compared to popular enrichment visualization graphs such as bar plots and dot plots, network graphs unveil the connection between the terms as significant terms often share one or multiple genes. Moreover, the enrichViewNet package takes advantage of a powerful network visualization tool which is Cytoscape. By doing so, all the functionalities of this mature software can be used to personalize and analyze the enrichment networks.

First, the enrichViewNet package enables the visualization of enrichment results, in a format corresponding to the one generated by gprofiler2, as a customizable Cytoscape network (Shannon et al. 2003). In the biological networks generated by enrichViewNet, both gene datasets (GO terms/pathways/protein complexes) and genes associated to the datasets are represented as nodes. While the edges connect each gene to its dataset(s). Only genes present in the query used for the enrichment analysis are shown.

A network where significant GO terms and genes are presented as nodes while edges connect each gene to its associated term(s).

Figure 1: A network where significant GO terms and genes are presented as nodes while edges connect each gene to its associated term(s)

The enrichViewNet package offers the option to generate a network for only a portion of the significant terms by selecting the source or by providing a specific list of terms.Once the network is created, the user can personalize the visual attributes and integrate external information such as expression profiles, phenotypes and other molecular states. The user can also perform network analysis.

In addition, the enrichViewNet package also provides the option to create enrichment maps from functional enrichment results. The enrichment maps have been introduced in the Bioconductor enrichplot package (Yu 2022). Enrichment maps enable the visualization of enriched terms into a network with edges connecting overlapping genes. Thus, enriched terms with overlapping genes cluster together. This type of graphs facilitate the identification of functional modules.

An enrichment map using significant Kegg terms where edges are connecting terms with overlapping genes.

Figure 2: An enrichment map using significant Kegg terms where edges are connecting terms with overlapping genes

enrichViewNet has been submitted to Bioconductor to aid researchers in carrying out reproducible network analyses using functional enrichment results.



4 Installation

To install this package from Bioconductor, start R (version 4.3 or later) and enter:

if (!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")

BiocManager::install("enrichViewNet")



5 General workflow

The following workflow gives an overview of the capabilities of enrichViewNet:

The enrichViewNet general workflow

Figure 3: The enrichViewNet general workflow

The principal input of enrichViewNet is a functional enrichment result in a format identical to the one generated by the CRAN gprofiler2 package.

From an enrichment result, the enrichViewNet offers two options:

For the gene-term network, the installation of Cytoscape software is highly recommended.



6 Transforming enrichment results into a gene-term network loadable in Cytoscape

The following workflow gives an overview of the steps associated to the creation of an gene-term network loadable in Cytoscape.

From an enrichment list (A) to a Cytoscape network (B).

Figure 4: From an enrichment list (A) to a Cytoscape network (B)

The key steps for the workflow are:

Step Function
Run an enrichment analysis gprofiler2::gost()
Start Cytoscape outside R
Create a gene-term network createNetwork()

The package::function() notation is used for functions from other packages.


6.1 Run an enrichment analysis

The first step consists in running an enrichment analysis with gprofiler2 package. The output of the gprofiler2::gost() is a list and should be saved.

## Required library
library(gprofiler2)

## The dataset of differentially expressed genes done between 
## napabucasin treated and DMSO control parental (Froeling et al 2019)
## All genes tested are present
data("parentalNapaVsDMSODEG")

## Retain significant results 
## (absolute fold change superior to 1 and adjusted p-value inferior to 0.05)
retained <- which(abs(parentalNapaVsDMSODEG$log2FoldChange) > 1 & 
                            parentalNapaVsDMSODEG$padj < 0.05)
signRes <-  parentalNapaVsDMSODEG[retained, ]

## Run one functional enrichment analysis using all significant genes
## The species is homo sapiens ("hsapiens")
## The g:SCS multiple testing correction method (Raudvere U et al 2019)
## The WikiPathways database is used
## Only the significant results are retained (significant=TRUE)
## The evidence codes are included in the results (evcodes=TRUE)
## A custom background included the tested genes is used
gostres <- gprofiler2::gost(
                query=list(parental_napa_vs_DMSO=unique(signRes$EnsemblID)),
                organism="hsapiens",
                correction_method="g_SCS",
                sources=c("WP"),
                significant=TRUE,
                evcodes=TRUE,
                custom_bg=unique(parentalNapaVsDMSODEG$EnsemblID))


The gost() function returns an named list of 2 entries:

  • The result entry contains the enrichment results.
  • The meta entry contains the metadata information.


## The 'gostres' object is a list of 2 entries
## The 'result' entry contains the enrichment results
## The 'meta' entry contains the metadata information

## Some columns of interest in the results
gostres$result[1:4, c("query", "p_value", "term_size", 
                    "query_size", "intersection_size", "term_id")]
##                   query      p_value term_size query_size intersection_size
## 1 parental_napa_vs_DMSO 2.905549e-08        25        157                 7
## 2 parental_napa_vs_DMSO 9.764963e-07        23        157                 6
## 3 parental_napa_vs_DMSO 9.863615e-07       391        157                16
## 4 parental_napa_vs_DMSO 2.065842e-06        13        157                 5
##     term_id
## 1 WP:WP3613
## 2 WP:WP4925
## 3 WP:WP3888
## 4 WP:WP4211
## The term names can be longer than the one shown
gostres$result[19:22, c("term_id", "source", "term_name")]
##      term_id source
## 19 WP:WP2877     WP
## 20 WP:WP5373     WP
## 21  WP:WP516     WP
## 22 WP:WP1742     WP
##                                                                     term_name
## 19                                                 Vitamin D receptor pathway
## 20 Role of hypoxia angiogenesis and FGF pathway in OA chondrocyte hypertrophy
## 21                                                          Hypertrophy model
## 22                                                               TP53 network


6.2 Start Cytoscape

Cytoscape is an open source software for visualizing networks. It enables network integration with any type of attribute data. The Cytoscape software is available at the Cytoscape website.

Cytoscape software logo.

Figure 5: Cytoscape software logo

The Cytoscape network generated by enrichViewNet will be automatically loaded into the Cytoscape software when the application is running.

If the application is not running, a CX JSON file will be created (standard Cytoscape file format). The file can then be loaded manually into the Cytoscape software.


6.3 Create a gene-term network

The gene-term network can be created with the createNetwork() function. If Cytoscape is opened, the network should automatically be loaded in the application. Otherwise, a CX JSON file is created. The CX JSON can be manually be opened in Cytoscape.

The following figure shows what the gene-term network looks like in Cytoscape. As there are numerous significant Reactome terms, the network is a bit hectic.

## Load saved enrichment results between parental Napa vs DMSO
data("parentalNapaVsDMSOEnrichment")

## Create network for REACTOME significant terms
## The 'removeRoot=TRUE' parameter removes the root term from the network
## The network will either by created in Cytoscape (if the application is open)
## or a CX file will be created in the temporary directory
createNetwork(gostObject=parentalNapaVsDMSOEnrichment,  source="REAC", 
        removeRoot=TRUE, title="REACTOME_All", 
        collection="parental_napa_vs_DMSO", 
        fileName=file.path(tempdir(), "parentalNapaVsDMSOEnrichment.cx"))
## [1] TRUE


This is an example of the Reactome network in Cytoscape.

All reactome terms in a gene-term network loaded in Cytoscape.

Figure 6: All reactome terms in a gene-term network loaded in Cytoscape


To address this situation, an updated gene-term network containing only Reactome terms of interest is created.


## Load saved enrichment results between parental Napa vs DMSO
data("parentalNapaVsDMSOEnrichment")

## List of terms of interest
reactomeSelected <- c("REAC:R-HSA-9031628", "REAC:R-HSA-198725", 
                        "REAC:R-HSA-9614085", "REAC:R-HSA-9617828",
                        "REAC:R-HSA-9614657", "REAC:R-HSA-73857",
                        "REAC:R-HSA-74160", "REAC:R-HSA-381340")

## All enrichment results
results <- parentalNapaVsDMSOEnrichment$result

## Retain selected results
selectedRes <- results[which(results$term_id %in% reactomeSelected), ]

## Print the first selected terms
selectedRes[, c("term_name")]
## [1] "NGF-stimulated transcription"                                 
## [2] "Nuclear Events (kinase and transcription factor activation)"  
## [3] "FOXO-mediated transcription"                                  
## [4] "FOXO-mediated transcription of cell death genes"              
## [5] "RNA Polymerase II Transcription"                              
## [6] "Gene expression (Transcription)"                              
## [7] "Transcriptional regulation of white adipocyte differentiation"
## [8] "FOXO-mediated transcription of cell cycle genes"


## Create network for REACTOME selected terms
## The 'source="TERM_ID"' parameter enable to specify a personalized 
## list of terms of interest
## The network will either by created in Cytoscape (if the application is open)
## or a CX file will be created in the temporary directory
createNetwork(gostObject=parentalNapaVsDMSOEnrichment,  source="TERM_ID", 
        termIDs=selectedRes$term_id, title="REACTOME_Selected", 
        collection="parental_napa_vs_DMSO",
        fileName=file.path(tempdir(), "parentalNapaVsDMSO_REACTOME.cx"))
## [1] TRUE


The updated Reactome network in Cytoscape.

Selected Reactome terms in a gene-term network loaded in Cytoscape.

Figure 7: Selected Reactome terms in a gene-term network loaded in Cytoscape


In Cytoscape, the appearance of a network is easily customized. As example, default color and shape for all nodes can be modified. For this example, the nodes have been moved to clarify their relation with the Reactome terms.

Final Reactome network after customization inside Cytoscape.

Figure 8: Final Reactome network after customization inside Cytoscape


The final Reactome network, after customization inside Cytoscape, shows that multiple transcription enriched terms (FOXO-mediated transcription, FOXO-mediated transcription of cell cycle genes, transcription regulation of white adipocyte differentiation, RNA polymerase II transcription and NGF-stimulated transcription terms) are linked through enriched genes.



7 Transforming enrichment results into an enrichment map

The following workflow gives an overview of the steps associated to the creation of an enrichment map.

The key steps for the workflow are:

Step Function
Run an enrichment analysis gprofiler2::gost()
Create an enrichment map createEnrichMap()

The package::function() notation is used for functions from other packages.


7.1 Run an enrichment analysis

The first step consists in running an enrichment analysis with gprofiler2 package. The output of the gprofiler2::gost() is a list and should be saved.

## Required library
library(gprofiler2)

## The dataset of differentially expressed genes done between 
## napabucasin treated and DMSO control parental (Froeling et al 2019)
## All genes tested are present
data("parentalNapaVsDMSODEG")

## Retain significant results 
## (absolute fold change superior to 1 and adjusted p-value inferior to 0.05)
retained <- which(abs(parentalNapaVsDMSODEG$log2FoldChange) > 1 & 
                        parentalNapaVsDMSODEG$padj < 0.05)
signRes <- parentalNapaVsDMSODEG[retained, ]

## Run one functional enrichment analysis using all significant genes
## The species is homo sapiens ("hsapiens")
## The g:SCS multiple testing correction method (Raudvere U et al 2019)
## The WikiPathways database is used
## Only the significant results are retained (significant=TRUE)
## The evidence codes are included in the results (evcodes=TRUE)
## A custom background included the tested genes is used
gostres <- gprofiler2::gost(
                query=list(parental_napa_vs_DMSO=unique(signRes$EnsemblID)),
                organism="hsapiens",
                correction_method="g_SCS",
                sources=c("WP"),
                significant=TRUE,
                evcodes=TRUE,
                custom_bg=unique(parentalNapaVsDMSODEG$EnsemblID))


The gost() function returns an named list of 2 entries:

  • The result entry contains the enrichment results.
  • The meta entry contains the metadata information.


## The 'gostres' object is a list of 2 entries
## The 'result' entry contains the enrichment results
## The 'meta' entry contains the metadata information

## Some columns of interest in the results
gostres$result[1:4, c("query", "p_value", "term_size", 
                    "query_size", "intersection_size", "term_id")]
##                   query      p_value term_size query_size intersection_size
## 1 parental_napa_vs_DMSO 2.905549e-08        25        157                 7
## 2 parental_napa_vs_DMSO 9.764963e-07        23        157                 6
## 3 parental_napa_vs_DMSO 9.863615e-07       391        157                16
## 4 parental_napa_vs_DMSO 2.065842e-06        13        157                 5
##     term_id
## 1 WP:WP3613
## 2 WP:WP4925
## 3 WP:WP3888
## 4 WP:WP4211
## The term names can be longer than the one shown
gostres$result[19:22, c("term_id", "source", "term_name")]
##      term_id source
## 19 WP:WP2877     WP
## 20 WP:WP5373     WP
## 21  WP:WP516     WP
## 22 WP:WP1742     WP
##                                                                     term_name
## 19                                                 Vitamin D receptor pathway
## 20 Role of hypoxia angiogenesis and FGF pathway in OA chondrocyte hypertrophy
## 21                                                          Hypertrophy model
## 22                                                               TP53 network


7.2 Create an enrichment map

The enrichment map can be created with the createEnrichMap() function. The function generates a ggplot object.

In an enrichment map, terms with overlapping significant genes tend to cluster together. The Jaccard correlation coefficient is used as a metric of similarity. Terms with high similarity (similarity metric > 0.2) are linked together by edges. The edges are shorter and thicker when the similarity metric is high.

## Load saved enrichment results between parental Napa vs DMSO
data(parentalNapaVsDMSOEnrichment)

## Set seed to ensure reproducible results
set.seed(121)

## Create network for all Kegg terms
createEnrichMap(gostObject=parentalNapaVsDMSOEnrichment, 
                    query="parental_napa_vs_DMSO", 
                    source="KEGG")
A Kegg enrichment map where terms with overlapping significant genes cluster together.

Figure 9: A Kegg enrichment map where terms with overlapping significant genes cluster together


The Kegg enrichment map shows that the MAPK signaling pathway term is highly influential in the network. In addition, the Transcriptional misregulation in cancer term is the only isolated node.


7.2.1 Using list of term IDs

Instead of using a specific source, the enrichment map can be generated from a personalized array of term IDs. To do so, the source parameter must be set to “TERM_ID” and an array of term IDs must be passed to the termIDs parameter. The terms can be pick-up from different sources. If more than one terms have the same description, the term ID will be added at the end of the description for clarity.

## Load saved enrichment results between parental Napa vs DMSO
data(parentalNapaVsDMSOEnrichment)

## The term IDs must correspond to the IDs present in the "term_id" column
head(parentalNapaVsDMSOEnrichment$result[, c("query", "term_id", "term_name")], 
     n=3)
##                   query    term_id                                 term_name
## 1 parental_napa_vs_DMSO GO:0048523   negative regulation of cellular process
## 2 parental_napa_vs_DMSO GO:0008219                                cell death
## 3 parental_napa_vs_DMSO GO:0048519 negative regulation of biological process
## List of selected terms from different sources
termID <- c("KEGG:04115", "WP:WP4963", "KEGG:04010", 
                "REAC:R-HSA-5675221", "REAC:R-HSA-112409", "WP:WP382")

## Set seed to ensure reproducible results
set.seed(222)

## Create network for all selected terms
createEnrichMap(gostObject=parentalNapaVsDMSOEnrichment, 
                    query="parental_napa_vs_DMSO", 
                    source="TERM_ID", termIDs=termID)
An enrichment map showing only the user selected terms.

Figure 10: An enrichment map showing only the user selected terms


As the description “MAPK signaling pathway” is present twice, the ID of each term has been added to the end of the description. Hence, the MAPK pathway from WikiPathway can be distinguished from the Kegg pathway.


7.2.2 Effect of seed value

The layering of the nodes is not always optimal. As the layering is affected by the seed value, you might want to test few seed values, using the set.seed() function, before selecting the final graph.

## Set seed to ensure reproducible results
set.seed(91)

## Create network for all Kegg terms
createEnrichMap(gostObject=parentalNapaVsDMSOEnrichment, 
                    query="parental_napa_vs_DMSO", 
                    source="KEGG")
An enrichment map with a different seed.

Figure 11: An enrichment map with a different seed


7.2.3 Enrichment map customization

The output of the createEnrichMap() function is a ggplot object. This means that the graph can be personalized. For example, the default colors can be changed:

## The ggplot2 library is required
library(ggplot2)

## Set seed to ensure reproducible results
set.seed(91)

## Create network for all Kegg terms
graphKegg <- createEnrichMap(gostObject=parentalNapaVsDMSOEnrichment, 
                    query="parental_napa_vs_DMSO", 
                    source="KEGG")

## Nodes with lowest p-values will be in orange and highest p-values in black
## The title of the legend is also modified
graphKegg + scale_fill_gradient(name="P-value adjusted", low="orange", 
                                    high="black")
An enrichment map with personalized colors.

Figure 12: An enrichment map with personalized colors



8 Transforming enrichment results into an enrichment map with groups from different enrichment analyses

The following workflow gives an overview of the steps associated to the creation of an enrichment map with groups. Groups can be created from the same enrichment analysis or can come from different enrichment analyses. This section is specifically for enrichment maps created using multiple enrichment analyses.

The key steps for the workflow are:

Step Function
Run multiple enrichment analyses gprofiler2::gost()
Create an enrichment map with groups createEnrichMapMultiBasic()

The package::function() notation is used for functions from other packages.


The first step has been presented in the previous section.


8.1 Create an enrichment map using multiple enrichment analyses

An enrichment map can show enrichment results from multiple enrichment analyses. A different color is used for each analysis. This enables to highlight the similarities and differences between the analyses.

## Set seed to ensure reproducible results
set.seed(2121)

## The dataset of functional enriched terms for two experiments:
## napabucasin treated and DMSO control parental and
## napabucasin treated and DMSO control expressing Rosa26 control vector
## (Froeling et al 2019)
data("parentalNapaVsDMSOEnrichment")
data("rosaNapaVsDMSOEnrichment")

## The gostObjectList is a list containing all 
## the functional enrichment objects
gostObjectList <- list(parentalNapaVsDMSOEnrichment, 
    rosaNapaVsDMSOEnrichment)

## The queryList is a list of query names retained for each of the enrichment 
## object (same order). Beware that a enrichment object can contain more than 
## one query.
query_01 <- unique(parentalNapaVsDMSOEnrichment$result$query)[1]
query_02 <- unique(rosaNapaVsDMSOEnrichment$result$query)[1]
queryList <- list(query_01, query_02)

## Enrichment map where the groups are the KEGG results for the 2 different
## experiments
createEnrichMapMultiBasic(gostObjectList=gostObjectList,
    queryList=queryList, source="KEGG", removeRoot=TRUE)
An enrichment map containing Kegg enrichment results for 2 different experiments.

Figure 13: An enrichment map containing Kegg enrichment results for 2 different experiments


There are 4 KEGG enrichment terms that are shared by the 2 experiments. In additions, the Viral carinogenesis term is the only term that is not related to the other KEGG terms.


8.2 Effect of seed value

The layering of the nodes is not always optimal. As the layering is affected by the seed value, you might want to test few seed values, using the set.seed() function, before selecting the final graph.

## Set seed to ensure reproducible results
set.seed(5)

## Enrichment map where the groups are the KEGG results for the 2 different
## experiments
createEnrichMapMultiBasic(gostObjectList=gostObjectList,
    queryList=queryList, source="KEGG", removeRoot=TRUE)
An enrichment map using the same data fromt the previous one but with a different seed.

Figure 14: An enrichment map using the same data fromt the previous one but with a different seed


8.3 Enrichment map customization

The output of the createEnrichMapMultiBasic() function is a ggplot object. This means that the graph can be personalized. For example, the default colors, legend title and legend font can be changed:

## Required library
library(ggplot2)

## Set seed to ensure reproducible results
set.seed(5)

## Enrichment map where the groups are the KEGG results for the 2 different
## experiments
createEnrichMapMultiBasic(gostObjectList=gostObjectList,
    queryList=queryList, source="KEGG", removeRoot=TRUE) +
        scale_fill_manual(name="Groups",
                breaks = queryList,
                values = c("cyan4", "bisque3"),
                labels = c("parental", "rosa")) +
        theme(legend.title = element_text(face="bold"))
An enrichment map using KEGG terms from two enrichment analyses with personalized colors and legend.

Figure 15: An enrichment map using KEGG terms from two enrichment analyses with personalized colors and legend



9 Transforming enrichment results into an enrichment map with groups from same enrichment analysis or from complex designs

The following workflow gives an overview of the steps associated to the creation of an enrichment map with groups. Groups can be created from the same enrichment analysis or can come from different enrichment analyses. This section is specifically for enrichment maps created using complex designs.

The key steps for the workflow are:

Step Function
Run one ro multiple enrichment analyses gprofiler2::gost()
Create an enrichment map with group createEnrichMapMultiComplex()

The package::function() notation is used for functions from other packages.


The first step has been presented in the previous section.


9.1 Create an enrichment map using mutliple subsections of one enrichment analysis

The createEnrichMapMultiComplex() function enables enrichment map to be generate with multiple groups from one enrichment analysis. As an example, terms from different sources can be drawn in different colors.

## Set seed to ensure reproducible results
set.seed(3221)

## The dataset of functional enriched terms for one experiment:
## napabucasin treated and DMSO control expressing Rosa26 control vector
## (Froeling et al 2019)
data("rosaNapaVsDMSOEnrichment")

## The gostObjectList is a list containing all 
## the functional enrichment objects
## In this case, the same enrichment object is used twice
gostObjectList <- list(rosaNapaVsDMSOEnrichment, 
    rosaNapaVsDMSOEnrichment)

## Extract the query name from the enrichment object
query_01 <- unique(rosaNapaVsDMSOEnrichment$result$query)[1]

## The query information is a data frame containing the information required 
##   to extract the specific terms for each enrichment object.
## The number of rows must correspond to the number of enrichment objects/
## The query name must be present in the enrichment object.
## The source can be: "GO:BP" for Gene Ontology Biological Process, 
##   "GO:CC" for Gene Ontology Cellular Component, "GO:MF" for Gene Ontology 
##   Molecular Function, "KEGG" for Kegg, "REAC" for Reactome, 
##   "TF" for TRANSFAC, "MIRNA" for miRTarBase, "CORUM" for CORUM database, 
##   "HP" for Human phenotype ontology and "WP" for WikiPathways or 
##   "TERM_ID" when a list of terms is specified.
## The termsIDs is an empty string except when the source is set to "TERM_ID".
## The group names are going to be used in the legend and should be unique to 
##  each group.
queryInfo <- data.frame(queryName=c(query_01, query_01), 
                            source=c("KEGG", "REAC"),
                            removeRoot=c(TRUE, TRUE),
                            termIDs=c("", ""), 
                            groupName=c("Kegg", "Reactome"),
                            stringsAsFactors=FALSE)

## Enrichment map where the groups are the KEGG and Reactome results for the 
## same experiment
createEnrichMapMultiComplex(gostObjectList=gostObjectList, 
        queryInfo=queryInfo)
An enrichment map containing Kegg and Reactome results from the rosa Napa vs DMSO analysis.

Figure 16: An enrichment map containing Kegg and Reactome results from the rosa Napa vs DMSO analysis


The significant genes are overlapping between the Kegg and Reactome pathways as shown by the connections between the pathways. The only isolated term is the Kegg Viral carcinogenesis pathway.


9.2 Create an enrichment map using a complex design

Complex designs can be used with createEnrichMapMultiComplex(). Results from different analyses as well as split results from the same analysis can be showcase together.

In this example, the enrichment map will contain 4 groups from 2 experiments: - MAP kinases related terms in rosa napabucasin treated vs DMSO control - MAP kinases related terms in parental napabucasin treated vs DMSO control - Interleukin related terms in rosa napabucasin treated vs DMSO control - Interleukin related terms in parental napabucasin treated vs DMSO control

## Set seed to ensure reproducible results
set.seed(28)

## The datasets of functional enriched terms for the two experiments:
## napabucasin treated and DMSO control expressing Rosa26 control vector and
## napabucasin treated and DMSO control parental MiaPaCa2 cells
## (Froeling et al 2019)
data("rosaNapaVsDMSOEnrichment")
data("parentalNapaVsDMSOEnrichment")

## The gostObjectList is a list containing all 
## the functional enrichment objects
## In this case, the same enrichment object is used twice
## The order of the objects must respect the order on the queryInfo data frame
## In this case: 
##   1. rosa dataset (for MAP kinases)
##   2. parental dataset (for MAP kinases)
##   3. rosa dataset (for interleukin)
##   4. parental dataset (for interleukin)
gostObjectList <- list(rosaNapaVsDMSOEnrichment, parentalNapaVsDMSOEnrichment, 
    rosaNapaVsDMSOEnrichment, parentalNapaVsDMSOEnrichment)

## Extract the query name from the enrichment object
query_rosa <- unique(rosaNapaVsDMSOEnrichment$result$query)[1]
query_parental <- unique(parentalNapaVsDMSOEnrichment$result$query)[1]


## List of selected terms that will be shown in each group
rosa_mapk <- "GO:0017017,GO:0033549,KEGG:04010,WP:WP382"
rosa_il <- "KEGG:04657,WP:WP4754" 
parental_mapk <- paste0("GO:0017017,GO:0033549,KEGG:04010,", 
                            "REAC:R-HSA-5675221,REAC:R-HSA-112409,WP:WP382")
parental_il <- "WP:WP4754,WP:WP395"


## The query information is a data frame containing the information required 
## to extract the specific terms for each enrichment object
## The number of rows must correspond to the number of enrichment objects
## The query name must be present in the enrichment object
## The source is set to "TERM_ID" so that the terms present in termIDs column 
##  will be used
## The group name will be used for the legend, the same name cannot be 
##  used twice
queryInfo <- data.frame(queryName=c(query_rosa, query_parental, 
                    query_rosa, query_parental), 
        source=c("TERM_ID", "TERM_ID", "TERM_ID", "TERM_ID"),
        removeRoot=c(FALSE, FALSE, FALSE, FALSE),
        termIDs=c(rosa_mapk, parental_mapk, rosa_il, parental_il),
        groupName=c("rosa - MAP kinases", "parental - MAP kinases", 
                        "rosa - Interleukin", "parental - Interleukin"),
        stringsAsFactors=FALSE)

## Enrichment map where the groups TODO
createEnrichMapMultiComplex(gostObjectList=gostObjectList, 
        queryInfo=queryInfo)
An enrichment map using selected terms related to MAP kinases and interleukin in two different experiments.

Figure 17: An enrichment map using selected terms related to MAP kinases and interleukin in two different experiments

While the two experiments have a lot of similar enriched terms, there are still terms that are unique to each experiment.



10 Acknowledgments

The differentially expressed genes between napabucasin-treated cells (0.5 uM) and DMSO as vehicle control are reprinted from Clinical Cancer Research, 2019, 25 (23), 7162–7174, Fieke E.M. Froeling, Manojit Mosur Swamynathan, Astrid Deschênes, Iok In Christine Chio, Erin Brosnan, Melissa A. Yao, Priya Alagesan, Matthew Lucito, Juying Li, An-Yun Chang, Lloyd C. Trotman, Pascal Belleau, Youngkyu Park, Harry A. Rogoff, James D. Watson, David A. Tuveson, Bioactivation of napabucasin triggers reactive oxygen species–mediated cancer cell death, with permission from AACR.

Robert L. Faure is also supported by the National Sciences Engineering Research Council of Canada (NSERCC): 155751-1501.



11 Session info

Here is the output of sessionInfo() on the system on which this document was compiled:

## R version 4.4.0 beta (2024-04-15 r86425)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.5.1       gprofiler2_0.2.3    enrichViewNet_1.2.0
## [4] knitr_1.46          BiocStyle_2.32.0   
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3      jsonlite_1.8.8          magrittr_2.0.3         
##   [4] magick_2.8.3            farver_2.1.1            rmarkdown_2.26         
##   [7] fs_1.6.4                zlibbioc_1.50.0         vctrs_0.6.5            
##  [10] memoise_2.0.1           RCurl_1.98-1.14         ggtree_3.12.0          
##  [13] base64enc_0.1-3         tinytex_0.50            htmltools_0.5.8.1      
##  [16] curl_5.2.1              gridGraphics_0.5-1      strex_2.0.0            
##  [19] sass_0.4.9              KernSmooth_2.23-22      bslib_0.7.0            
##  [22] htmlwidgets_1.6.4       plyr_1.8.9              plotly_4.10.4          
##  [25] cachem_1.0.8            uuid_1.2-0              igraph_2.0.3           
##  [28] lifecycle_1.0.4         pkgconfig_2.0.3         Matrix_1.7-0           
##  [31] R6_2.5.1                fastmap_1.1.1           GenomeInfoDbData_1.2.12
##  [34] digest_0.6.35           aplot_0.2.2             enrichplot_1.24.0      
##  [37] ggnewscale_0.4.10       colorspace_2.1-0        patchwork_1.2.0        
##  [40] AnnotationDbi_1.66.0    S4Vectors_0.42.0        RSQLite_2.3.6          
##  [43] base64url_1.4           labeling_0.4.3          fansi_1.0.6            
##  [46] RJSONIO_1.3-1.9         httr_1.4.7              polyclip_1.10-6        
##  [49] compiler_4.4.0          bit64_4.0.5             withr_3.0.0            
##  [52] backports_1.4.1         BiocParallel_1.38.0     viridis_0.6.5          
##  [55] RCy3_2.24.0             DBI_1.2.2               highr_0.10             
##  [58] ggforce_0.4.2           gplots_3.1.3.1          MASS_7.3-60.2          
##  [61] HDO.db_0.99.1           gtools_3.9.5            caTools_1.18.2         
##  [64] tools_4.4.0             scatterpie_0.2.2        ape_5.8                
##  [67] glue_1.7.0              nlme_3.1-164            GOSemSim_2.30.0        
##  [70] shadowtext_0.1.3        grid_4.4.0              checkmate_2.3.1        
##  [73] pbdZMQ_0.3-11           reshape2_1.4.4          fgsea_1.30.0           
##  [76] generics_0.1.3          gtable_0.3.5            tidyr_1.3.1            
##  [79] data.table_1.15.4       tidygraph_1.3.1         utf8_1.2.4             
##  [82] XVector_0.44.0          BiocGenerics_0.50.0     ggrepel_0.9.5          
##  [85] pillar_1.9.0            stringr_1.5.1           yulab.utils_0.1.4      
##  [88] IRdisplay_1.1           splines_4.4.0           dplyr_1.1.4            
##  [91] tweenr_2.0.3            treeio_1.28.0           lattice_0.22-6         
##  [94] bit_4.0.5               tidyselect_1.2.1        GO.db_3.19.1           
##  [97] Biostrings_2.72.0       gridExtra_2.3           bookdown_0.39          
## [100] IRanges_2.38.0          stats4_4.4.0            xfun_0.43              
## [103] graphlayouts_1.1.1      Biobase_2.64.0          stringi_1.8.3          
## [106] UCSC.utils_1.0.0        lazyeval_0.2.2          ggfun_0.1.4            
## [109] yaml_2.3.8              evaluate_0.23           codetools_0.2-20       
## [112] ggraph_2.2.1            tibble_3.2.1            qvalue_2.36.0          
## [115] BiocManager_1.30.22     graph_1.82.0            ggplotify_0.1.2        
## [118] cli_3.6.2               IRkernel_1.3.2          repr_1.1.7             
## [121] munsell_0.5.1           jquerylib_0.1.4         Rcpp_1.0.12            
## [124] GenomeInfoDb_1.40.0     png_0.1-8               XML_3.99-0.16.1        
## [127] parallel_4.4.0          blob_1.2.4              DOSE_3.30.0            
## [130] bitops_1.0-7            viridisLite_0.4.2       tidytree_0.4.6         
## [133] scales_1.3.0            purrr_1.0.2             crayon_1.5.2           
## [136] rlang_1.1.3             cowplot_1.1.3           fastmatch_1.1-4        
## [139] KEGGREST_1.44.0



References

Dennis, Glynn, Brad T. Sherman, Douglas A. Hosack, Jun Yang, Wei Gao, H. Clifford Lane, and Richard A. Lempicki. 2003. “DAVID: Database for Annotation, Visualization, and Integrated Discovery.” Genome Biology 4 (5). https://doi.org/10.1186/gb-2003-4-9-r60.

Kolberg, Liis, Uku Raudvere, Ivan Kuzmin, Jaak Vilo, and Hedi Peterson. 2020. “gprofiler2 – an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler.” F1000Research 9: 1–27. https://doi.org/10.12688/f1000research.24956.1.

Kuleshov, Maxim V., Matthew R. Jones, Andrew D. Rouillard, Nicolas F. Fernandez, Qiaonan Duan, Zichen Wang, Simon Koplev, et al. 2016. “Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.” Nucleic Acids Research 44 (W1): W90–W97. https://doi.org/10.1093/nar/gkw377.

Maere, Steven, Karel Heymans, and Martin Kuiper. 2005. “BiNGO: a Cytoscape plugin to assess overrepresentation of Gene Ontology categories in Biological Networks.” Bioinformatics 21 (16): 3448–9. https://doi.org/10.1093/bioinformatics/bti551.

Raudvere, Uku, Liis Kolberg, Ivan Kuzmin, Tambet Arak, Priit Adler, Hedi Peterson, and Jaak Vilo. 2019. “G:Profiler: A web server for functional enrichment analysis and conversions of gene lists (2019 update).” Nucleic Acids Research 47 (W1): W191–W198. https://doi.org/10.1093/nar/gkz369.

Scardoni, Giovanni, Michele Petterlini, and Carlo Laudanna. 2009. “Analyzing biological network parameters with CentiScaPe.” Bioinformatics 25 (21): 2857–9. https://doi.org/10.1093/bioinformatics/btp517.

Shannon, Paul, Andrew Markiel, Owen Ozier, Nitin S Baliga, Jonathan T Wang, Daniel Ramage, Nada Amin, Benno Schwikowski, and Trey Ideker. 2003. “Cytoscape: a software environment for integrated models of biomolecular interaction networks.” Genome Research 13 (11): 2498–2504. https://doi.org/10.1101/gr.1239303.

Yu, Guangchuang. 2022. Enrichplot: Visualization of Functional Enrichment Result. https://yulab-smu.top/biomedical-knowledge-mining-book/.