To shed light on cancer development, it is crucial to understand which genes play a role in the mechanisms linked to this disease and what that role is. Biological processes such as proliferation and apoptosis have commonly been linked to cancer progression. We have developed the Moonlight framework, which allows for prediction of cancer driver genes through multi-omics data integration. Based on expression data, we perform functional enrichment analysis, infer gene regulatory networks, and carry out upstream regulator analysis to score the importance of well-known biological processes with respect to the studied cancer. We then use these scores to predict oncogenic mediators with two specific roles: genes that potentially act as tumor suppressor genes (TSGs) and genes that potentially act as oncogenes (OCGs). This constitutes Moonlight’s primary layer. As gene expression data alone do not explain the cancer phenotypes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer that predicts driver mutations in the oncogenic mediators using the driver mutation prediction tool CScape-somatic and thereby allows for the prediction of cancer driver genes. These new functionalities are provided in the updated version of Moonlight, namely Moonlight2. Overall, this methodology not only allows us to identify genes with a dual role (TSG in one cancer type and OCG in another) but also to elucidate the underlying biological processes.
Cancer development is influenced by mutations in two distinct categories of genes, known as tumor suppressor genes (TSGs) and oncogenes (OCGs). Mutations in genes of the first category lead to faster cell proliferation, while mutations in genes of the second category increase or change their function. In 2020, we developed the Moonlight framework, which allows for prediction of cancer driver genes (Colaprico et al. 2020). Here, gene expression data are integrated with biological processes and gene regulatory networks to score the importance of well-known biological processes with respect to the studied cancer. These scores are used to predict oncogenic mediators: putative TSGs and putative OCGs. As gene expression data alone are not enough to explain the deregulation of the genes, a second layer of evidence is needed. For this reason, we automated the integration of a secondary mutational layer which predicts driver mutations and passenger mutations in the oncogenic mediators. These new functionalities are released in the updated version of Moonlight, Moonlight2R. The prediction of driver mutations is carried out using the CScape-somatic driver mutation prediction tool. Moreover, the new functionalities estimate the potential effect of a mutation on the transcriptional, translational, or protein structure/function level. Those oncogenic mediators with at least one driver mutation are retained as the final set of driver genes (Nourbakhsh et al. 2023).
Moonlight’s pipeline is shown below:
The proposed pipeline consists of the following eight steps:
To install Moonlight2R use the code below.
First, install devtools
or if you already have it installed, load it.
Install Moonlight2R from GitHub:
First, install the BiocStyle Bioconductor package.
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("BiocStyle")
Then install Moonlight2R with its accompanying vignette.
You can view the vignette in the following way.
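The installation steps described above can be sketched as one helper function. This is a minimal sketch, not the authors' exact commands: the function name install_moonlight2r is hypothetical, the GitHub repository path "ELELAB/Moonlight2R" is an assumption, and nothing is installed until the function is actually called.

```r
# Hypothetical helper wrapping the installation routes described above.
# Defining the function has no side effects; installation happens only on call.
install_moonlight2r <- function(from_github = FALSE,
                                repo = "ELELAB/Moonlight2R") {  # assumed repository path
  if (from_github) {
    # Development version: requires devtools
    if (!requireNamespace("devtools", quietly = TRUE)) {
      install.packages("devtools")
    }
    # build_vignettes = TRUE also builds the accompanying vignette
    devtools::install_github(repo, build_vignettes = TRUE)
  } else {
    # Release version from Bioconductor
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install("Moonlight2R")
  }
  # Report (invisibly) whether the package can now be loaded
  invisible(requireNamespace("Moonlight2R", quietly = TRUE))
}

# Example calls (commented out so that sourcing this file installs nothing):
# install_moonlight2r()                     # Bioconductor release
# install_moonlight2r(from_github = TRUE)   # GitHub version with vignette
# browseVignettes("Moonlight2R")            # open the installed vignette
```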
Obtain Input
The input to Moonlight consists of a set of differentially expressed genes together with gene expression data and mutation data. All three can, for example, be obtained from TCGA using the R package TCGAbiolinks. Help documents on how to use TCGAbiolinks are available in the TCGAbiolinks documentation, and further examples of its usage on TCGA cancer types can be found in our GitHub repository. Example data of the input (differentially expressed genes, gene expression data, and mutation data) are stored in the Moonlight2R package.
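As a minimal sketch, the example data could be loaded as follows. The dataset names DEGsmatrix, dataFilt and dataMAF are the ones used in the code later in this vignette; that they hold the differentially expressed genes, the gene expression data and the mutation data, respectively, is an assumption based on how they are passed to the functions below. The call is guarded so that it does nothing when Moonlight2R is not installed.

```r
# Load the example input data only if Moonlight2R is available.
has_moonlight <- requireNamespace("Moonlight2R", quietly = TRUE)

if (has_moonlight) {
  data(DEGsmatrix, dataFilt, dataMAF, package = "Moonlight2R")
  # Print the dimensions of each object (NULL for objects without dimensions)
  print(lapply(list(DEGsmatrix = DEGsmatrix,
                    dataFilt   = dataFilt,
                    dataMAF    = dataMAF), dim))
}
```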
Download: Get GEO data
You can search GEO data using the getDataGEO function.
GEO_TCGAtab: an 18x12 matrix that provides the GEO data set we matched to each of the 18 given TCGA cancer types
knitr::kable(GEO_TCGAtab, digits = 2,
             caption = "Table with GEO data set matched to one of the 18 given TCGA cancer types",
             row.names = TRUE)
  | Cancer | TP | NT | DEG. | Dataset | TP.1 | NT.1 | Platform | DEG.. | Common | GEO_Normal | GEO_Tumor |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | BLCA | 408 | 19 | 2937 | GSE13507 | 165 | 10 | GPL65000 | 2099 | 896 | control | cancer |
2 | BRCA | 1097 | 114 | 3390 | GSE39004 | 61 | 47 | GPL6244 | 2449 | 1248 | normal | Tumor |
3 | CHOL | 36 | 9 | 5015 | GSE26566 | 104 | 59 | GPL6104 | 3983 | 2587 | Surrounding | Tumor |
4 | COAD | 286 | 41 | 3788 | GSE41657 | 25 | 12 | GPL6480 | 3523 | 1367 | N | A |
5 | ESCA | 184 | 11 | 2525 | GSE20347 | 17 | 17 | GPL571 | 1316 | 406 | normal | carcinoma |
6 | GBM | 156 | 5 | 4828 | GSE50161 | 34 | 13 | GPL570 | 4504 | 2660 | normal | GBM |
7 | HNSC | 520 | 44 | 2973 | GSE6631 | 22 | 22 | GPL8300 | 142 | 129 | normal | cancer |
8 | KICH | 66 | 25 | 4355 | GSE15641 | 6 | 23 | GPL96 | 1789 | 680 | normal | chromophobe |
9 | KIRC | 533 | 72 | 3618 | GSE15641 | 32 | 23 | GPL96 | 2911 | 939 | normal | clear cell RCC |
10 | KIRP | 290 | 32 | 3748 | GSE15641 | 11 | 23 | GPL96 | 2020 | 756 | normal | papillary RCC |
11 | LIHC | 371 | 50 | 3043 | GSE45267 | 46 | 41 | GPL570 | 1583 | 860 | normal liver | HCC sample |
12 | LUAD | 515 | 59 | 3498 | GSE10072 | 58 | 49 | GPL96 | 666 | 555 | normal | tumor |
13 | LUSC | 503 | 51 | 4984 | GSE33479 | 14 | 27 | GPL6480 | 3729 | 1706 | normal | squamous cell carcinoma |
14 | PRAD | 497 | 52 | 1860 | GSE6919 | 81 | 90 | GPL8300 | 246 | 149 | normal prostate | tumor samples |
15 | READ | 94 | 10 | 3628 | GSE20842 | 65 | 65 | GPL4133 | 2172 | 1261 | M | T |
16 | STAD | 415 | 35 | 2622 | GSE2685 | 10 | 10 | GPL80 | 487 | 164 | N | T |
17 | THCA | 505 | 59 | 1994 | GSE33630 | 60 | 45 | GPL570 | 1451 | 781 | N | T |
18 | UCEC | 176 | 24 | 4183 | GSE17025 |  |  | GPL570 |  |  | tp | lcm |
getDataGEO: Search by cancer type and data type [Gene Expression]
The user can query and download the cancer types supported by GEO using the function getDataGEO.
FEA: Functional Enrichment Analysis
The user can perform a functional enrichment analysis using the function FEA.
For each DEG in the gene set a z-score is calculated. This score indicates how the genes act in the gene set.
The output can be visualized with an FEA plot.
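To make the idea of a signed z-score concrete, the sketch below uses a simple sign-counting formula, z = (N_up - N_down) / sqrt(N). This formula and the helper name activation_z are assumptions chosen for illustration; they are not taken from the FEA implementation, which may compute its score differently.

```r
# Hypothetical helper: sign-counting z-score for one biological process.
# `effects` holds +1 for each DEG whose expression change is consistent with
# increasing the process and -1 for each DEG consistent with decreasing it.
activation_z <- function(effects) {
  effects <- effects[effects != 0]          # ignore DEGs without a direction
  if (length(effects) == 0) return(NA_real_)
  sum(effects) / sqrt(length(effects))
}

z_up   <- activation_z(c(1, 1, 1, -1))    # mostly "increasing" evidence, positive score
z_down <- activation_z(c(-1, -1, -1, 1))  # mostly "decreasing" evidence, negative score
```

Under this toy formula, a process supported mainly by DEGs acting in the increasing direction receives a positive score and vice versa.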
plotFEA: Functional Enrichment Analysis Plot
The user can plot the result of a functional enrichment analysis using the function plotFEA.
A negative z-score indicates that the process’ activity is decreased. A positive z-score indicates that the process’ activity is increased.
The figure generated by the above code is shown below:
GRN: Gene Regulatory Network
The user can perform a gene regulatory network analysis using the function GRN, which infers the network using the parmigene package. For illustrative purposes and to decrease runtime, we have set nGenesPerm = 5 and nBoot = 5 in the example below. To achieve optimal results, however, we recommend setting these parameters to nGenesPerm = 2000 and nBoot = 400, which are the default values of the function arguments.
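One way to switch between the quick demonstration values and the recommended values is to keep both in a small helper. The helper name grn_settings is hypothetical; the parameter names and values are the ones stated above, and the commented call reuses the GRN arguments shown in the case studies of this vignette.

```r
# Hypothetical helper returning the two parameter sets mentioned in the text.
grn_settings <- function(quick = TRUE) {
  if (quick) {
    list(nGenesPerm = 5, nBoot = 5)        # fast, for illustration only
  } else {
    list(nGenesPerm = 2000, nBoot = 400)   # recommended values
  }
}

settings <- grn_settings(quick = FALSE)

# With Moonlight2R loaded and the example data available, the settings could
# be passed on like this (not run here):
# dataGRN <- do.call(GRN, c(list(TFs = rownames(DEGsmatrix),
#                                DEGsmatrix = DEGsmatrix,
#                                DiffGenes = TRUE,
#                                normCounts = dataFilt,
#                                kNearest = 3),
#                           settings))
```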
URA: Upstream Regulator Analysis
The user can perform upstream regulator analysis using the function URA. This function is applied to each DEG in the enriched gene set and its neighbors in the GRN.
PRA: Pattern Recognition Analysis
The user can retrieve TSG/OCG candidates using either selected biological processes or a random forest classifier trained on known COSMIC OCGs/TSGs.
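The approach based on selected biological processes can be illustrated with a simplified rule. The sketch below assumes that a candidate TSG has a positive score for apoptosis and a negative score for proliferation, that a candidate OCG shows the opposite pattern, and that a threshold plays a role similar to the thres.role argument used later in this vignette. The score matrix is invented and classify_role is a hypothetical name; this is not the PRA implementation.

```r
# Toy score matrix: genes in rows, two biological processes in columns
# (values invented for illustration).
scores <- matrix(c( 3.1, -2.4,
                   -2.8,  3.5,
                    0.4,  0.2),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("apoptosis", "proliferation of cells")))

# Hypothetical rule-based classifier with opposite patterns for TSGs and OCGs.
classify_role <- function(scores, thres = 0) {
  apo  <- scores[, "apoptosis"]
  prol <- scores[, "proliferation of cells"]
  role <- rep(NA_character_, nrow(scores))
  role[apo >  thres & prol < -thres] <- "TSG"  # promotes apoptosis, blocks proliferation
  role[apo < -thres & prol >  thres] <- "OCG"  # blocks apoptosis, promotes proliferation
  setNames(role, rownames(scores))
}

roles <- classify_role(scores, thres = 1)
```

Genes whose scores do not pass the threshold in opposite directions, such as geneC here, stay unclassified.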
DMA: Driver Mutation Analysis
The user can identify driver mutations with DMA in the oncogenic mediators established by PRA. The passenger or driver status is estimated with CScape-somatic. This function will further generate three files: DEG_Mutations_Annotations.rda, Oncogenic_mediators_mutation_summary.rda and cscape_somatic_output.rda. These will be placed in the specified results folder.
To run DMA, the user needs to download two CScape-somatic files named css_coding.vcf.gz and css_noncoding.vcf.gz. These two files can be downloaded at http://cscape-somatic.biocompute.org.uk/#download. The corresponding .tbi files (css_coding.vcf.gz.tbi and css_noncoding.vcf.gz.tbi) must also be downloaded and placed in the same folder.
data(dataPRA)
data(dataMAF)
data(DEGsmatrix)
data(LOC_transcription)
data(LOC_translation)
data(LOC_protein)
data(NCG)
data(EncodePromoters)
dataDMA <- DMA(dataMAF = dataMAF,
               dataDEGs = DEGsmatrix,
               dataPRA = dataPRA,
               results_folder = "DMAresults",
               coding_file = "css_coding.vcf.gz",
               noncoding_file = "css_noncoding.vcf.gz")
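Because DMA depends on four downloaded files, it can help to verify that they are all present before starting the analysis. The following is a minimal sketch using base R only; the helper name check_cscape_files is hypothetical, and the file names are the ones listed above.

```r
# Hypothetical pre-flight check: are the two CScape-somatic files and their
# .tbi index files present in the given folder?
check_cscape_files <- function(folder = ".") {
  vcf    <- c("css_coding.vcf.gz", "css_noncoding.vcf.gz")
  needed <- c(vcf, paste0(vcf, ".tbi"))
  present <- file.exists(file.path(folder, needed))
  setNames(present, needed)
}

status <- check_cscape_files(".")
if (!all(status)) {
  message("Missing CScape-somatic files: ",
          paste(names(status)[!status], collapse = ", "))
}
```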
GLS: Gene Literature Search
The user can perform a literature search on driver genes predicted from DMA using the GLS function. This function takes as input driver genes, a query and the maximum number of records to retrieve from PubMed. Standard PubMed syntax can be used in the query. For example, the Boolean operators AND, OR, NOT can be applied and tags such as [AU], [TITLE/ABSTRACT], [Affiliation] can be used. GLS fetches data of PubMed records matching the specified query and outputs PubMed IDs matching the query along with DOI, title, abstract, year of publication, keywords, and total number of PubMed publications. This is done for each of the genes supplied in the input.
data(dataDMA)
genes_query <- Reduce(c, dataDMA)
dataGLS <- GLS(genes = genes_query,
               query_string = "AND cancer AND driver",
               max_records = 20)
## Processing PubMed data .............. done!
## # A tibble: 6 × 8
## pmid gene doi title abstract year keywords pubmed_count
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 36831452 ABCG2 10.3390/cancers1504… Hypo… Epithel… 2023 4-hydro… 14
## 2 35956408 ABCG2 10.3390/nu14153232 Sele… Cisplat… 2022 AMPK; c… 14
## 3 35427871 ABCG2 10.1016/j.drup.2022… Clin… Small-m… 2022 NSCLC; … 14
## 4 34255797 ABCG2 10.1371/journal.pon… Meta… Abcg2/B… 2021 ATP Bin… 14
## 5 34065402 ABCG2 10.3390/ijms22105384 KRAS… Lung ca… 2021 ABC dru… 14
## 6 33455278 ABCG2 10.1021/acsbiomater… SOX2… Certain… 2021 fibroge… 14
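Since the query string in the example starts with a Boolean operator, it is presumably appended to each gene name to form one PubMed query per gene. The sketch below spells out that reading with base R; how GLS combines the two internally is an assumption, build_queries is a hypothetical helper, and TP53 is added only as a second illustrative gene.

```r
# Hypothetical helper: one PubMed-style query per gene, formed by appending
# the shared query string to the gene symbol.
build_queries <- function(genes, query_string) {
  setNames(paste(genes, query_string), genes)
}

queries <- build_queries(genes = c("ABCG2", "TP53"),
                         query_string = "AND cancer AND driver")
print(queries)
```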
Level of consequence: Effect of mutations on three different levels
The user can investigate the predicted effect of different mutation types on the transcriptional level through the table LOC_transcription:
Variant_Classification | SNP | INS | DEL | DNP | TNP | ONP |
---|---|---|---|---|---|---|
5’Flank | 1 | 1 | 1 | 1 | 1 | 1 |
5’UTR | 1 | 1 | 1 | 1 | 1 | 1 |
Translation_Start_Site | 0 | 0 | 0 | 0 | 0 | 0 |
Missense_Mutation | 0 | NA | NA | 0 | 0 | 0 |
Nonsense_Mutation | 1 | 1 | 1 | 1 | 1 | 1 |
Nonstop_Mutation | 0 | 0 | 0 | 0 | 0 | 0 |
Splice_Site | 1 | 1 | 1 | 1 | 1 | 1 |
Splice_Region | 1 | 1 | 1 | 1 | 1 | 1 |
Silent | 0 | NA | NA | 0 | 0 | 0 |
In_Frame_Del | NA | 0 | 0 | NA | NA | NA |
In_Frame_Ins | NA | 0 | 0 | NA | NA | NA |
Frame_Shift_Del | NA | 0 | 0 | NA | NA | NA |
Frame_Shift_Ins | NA | 0 | 0 | NA | NA | NA |
Intron | 1 | 1 | 1 | 1 | 1 | 1 |
3’UTR | 1 | 1 | 1 | 1 | 1 | 1 |
3’Flank | 1 | 1 | 1 | 1 | 1 | 1 |
RNA | 1 | 1 | 1 | 1 | 1 | 1 |
IGR | 1 | 1 | 1 | 1 | 1 | 1 |
The user can investigate the predicted effect of different mutation types on the translational level through the table LOC_translation:
Variant_Classification | SNP | INS | DEL | DNP | TNP | ONP |
---|---|---|---|---|---|---|
5’Flank | 1 | 1 | 1 | 1 | 1 | 1 |
5’UTR | 1 | 1 | 1 | 1 | 1 | 1 |
Translation_Start_Site | 1 | 1 | 1 | 1 | 1 | 1 |
Missense_Mutation | 0 | NA | NA | 0 | 0 | 0 |
Nonsense_Mutation | 1 | 1 | 1 | 1 | 1 | 1 |
Nonstop_Mutation | 1 | 1 | 1 | 1 | 1 | 1 |
Splice_Site | 1 | 1 | 1 | 1 | 1 | 1 |
Splice_Region | 1 | 1 | 1 | 1 | 1 | 1 |
Silent | 1 | NA | NA | 1 | 1 | 1 |
In_Frame_Del | NA | 0 | 0 | NA | NA | NA |
In_Frame_Ins | NA | 0 | 0 | NA | NA | NA |
Frame_Shift_Del | NA | 0 | 0 | NA | NA | NA |
Frame_shift_Ins | NA | 0 | 0 | NA | NA | NA |
Intron | 1 | 1 | 1 | 1 | 1 | 1 |
3’UTR | 1 | 1 | 1 | 1 | 1 | 1 |
3’Flank | 1 | 1 | 1 | 1 | 1 | 1 |
IGR | 1 | 1 | 1 | 1 | 1 | 1 |
RNA | 1 | 1 | 1 | 1 | 1 | 1 |
The user can investigate the predicted effect of different mutation types on the protein level through the table LOC_protein:
Variant_Classification | SNP | INS | DEL | DNP | TNP | ONP |
---|---|---|---|---|---|---|
5’Flank | 0 | 0 | 0 | 0 | 0 | 0 |
5’UTR | 0 | 0 | 0 | 0 | 0 | 0 |
Translation_Start_Site | 1 | 1 | 1 | 1 | 1 | 1 |
Missense_Mutation | 1 | NA | NA | 1 | 1 | 1 |
Nonsense_Mutation | 1 | 1 | 1 | 1 | 1 | 1 |
Nonstop_Mutation | 1 | 1 | 1 | 1 | 1 | 1 |
Splice_Site | 1 | 1 | 1 | 1 | 1 | 1 |
Splice_Region | 1 | 1 | 1 | 1 | 1 | 1 |
Silent | 0 | NA | NA | 0 | 0 | 0 |
In_Frame_Del | NA | 1 | 1 | NA | NA | NA |
In_Frame_Ins | NA | 1 | 1 | NA | NA | NA |
Frame_Shift_Del | NA | 1 | 1 | NA | NA | NA |
Frame_Shift_Ins | NA | 1 | 1 | NA | NA | NA |
Intron | 1 | 1 | 1 | 1 | 1 | 1 |
3’UTR | 0 | 0 | 0 | 0 | 0 | 0 |
3’Flank | 0 | 0 | 0 | 0 | 0 | 0 |
RNA | 0 | 0 | 0 | 0 | 0 | 0 |
IGR | 0 | 0 | 0 | 0 | 0 | 0 |
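The three tables can be read together to see on which levels a given mutation is predicted to have an effect. The sketch below copies the SNP column for three variant classifications from the tables above into a small data frame and looks up the affected levels. The data frame and the helper affected_levels are illustrative constructs, not objects provided by the package.

```r
# SNP entries for three variant classifications, copied from the tables above
# (1 = potential effect, 0 = no predicted effect). "5'UTR" is written with a
# plain ASCII apostrophe here.
loc_snp <- data.frame(
  Variant_Classification = c("Missense_Mutation", "Silent", "5'UTR"),
  transcription = c(0, 0, 1),
  translation   = c(0, 1, 1),
  protein       = c(1, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of the levels with a predicted effect.
affected_levels <- function(variant, tab = loc_snp) {
  row <- tab[tab$Variant_Classification == variant, -1]
  names(row)[unlist(row) == 1]
}

affected_levels("Missense_Mutation")
affected_levels("5'UTR")
```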
plotNetworkHive: GRN hive visualization taking into account COSMIC cancer genes
In the following plot the nodes are separated into three groups: known tumor suppressor genes (yellow), known oncogenes (green) and the rest (gray).
plotDMA: Heatmap of the driver/passenger status of mutations in TSGs/OCGs
In the following plot the driver genes with driver mutations are shown.
plotMoonlight: Heatmap of Moonlight gene z-score for the TSGs/OCGs
In the following plot the top 50 genes with the most driver mutations are visualised. The values are the Moonlight gene z-scores for the two biological processes.
This vignette shows a complete workflow of the ‘Moonlight2R’ package. The code is divided into three case studies:
For illustrative purposes and to decrease runtime, we have set nGenesPerm = 5 and nBoot = 5 in the call of GRN in the following code block. To achieve optimal results, however, we recommend setting these parameters to nGenesPerm = 2000 and nBoot = 400, which are the default values of the function arguments.
data(DEGsmatrix)
data(dataFilt)
data(DiseaseList)
data(EAGenes)
data(tabGrowBlock)
data(knownDriverGenes)
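# Functional enrichment analysis on the differentially expressed genes
dataFEA <- FEA(DEGsmatrix = DEGsmatrix)
# Gene regulatory network inferred for 100 randomly sampled DEGs
dataGRN <- GRN(TFs = sample(rownames(DEGsmatrix), 100),
               DEGsmatrix = DEGsmatrix,
               DiffGenes = TRUE,
               normCounts = dataFilt,
               nGenesPerm = 5,
               nBoot = 5,
               kNearest = 3)
# Upstream regulator analysis for the two selected biological processes
dataURA <- URA(dataGRN = dataGRN,
               DEGsmatrix = DEGsmatrix,
               BPname = c("apoptosis",
                          "proliferation of cells"))
# Pattern recognition analysis to predict TSGs and OCGs
dataDual <- PRA(dataURA = dataURA,
                BPname = c("apoptosis",
                           "proliferation of cells"),
                thres.role = 0)
oncogenic_mediators <- list("TSG" = names(dataDual$TSG),
                            "OCG" = names(dataDual$OCG))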
plotURA: Upstream regulatory analysis plot
The user can plot the result of the upstream regulatory analysis using the function plotURA.
The figure resulting from the code above is shown below:
For illustrative purposes and to decrease runtime, we have set nGenesPerm = 5 and nBoot = 5 in the example below. To achieve optimal results, however, we recommend setting these parameters to nGenesPerm = 2000 and nBoot = 400, which are the default values of the function arguments.
data(dataFilt)
data(DEGsmatrix)
data(dataMAF)
data(DiseaseList)
data(EAGenes)
data(tabGrowBlock)
data(knownDriverGenes)
data(LOC_transcription)
data(LOC_translation)
data(LOC_protein)
data(NCG)
data(EncodePromoters)
listMoonlight <- moonlight(dataDEGs = DEGsmatrix,
                           dataFilt = dataFilt,
                           nTF = 100,
                           DiffGenes = TRUE,
                           nGenesPerm = 5,
                           nBoot = 5,
                           BPname = c("apoptosis", "proliferation of cells"),
                           dataMAF = dataMAF,
                           path_cscape_coding = "css_coding.vcf.gz",
                           path_cscape_noncoding = "css_noncoding.vcf.gz")
save(listMoonlight, file = "listMoonlight_ncancer4.Rdata")
plotCircos: Moonlight Circos Plot
An example of running Moonlight on five cancer types is visualized below in a circos plot. Outer ring: color by cancer type. Inner ring: OCGs and TSGs. Inner connections: green, common OCGs; yellow, common TSGs; red, possible dual role.
The figure generated by the code above is shown below:
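Independently of the plot, the notion of a possible dual role (TSG in one cancer type and OCG in another) can be sketched with base R set operations. The cancer labels, gene names and list layout below are invented for illustration and do not reflect the structure of the object returned by moonlight.

```r
# Hypothetical per-cancer predictions (all names invented).
predictions <- list(
  cancer1 = list(TSG = c("G1", "G2"), OCG = c("G3")),
  cancer2 = list(TSG = c("G3", "G4"), OCG = c("G1", "G5"))
)

# Collect all genes predicted as TSG or as OCG in any cancer type
all_tsg <- unique(unlist(lapply(predictions, `[[`, "TSG")))
all_ocg <- unique(unlist(lapply(predictions, `[[`, "OCG")))

# Genes appearing in both sets are candidates for a dual role
dual_role <- intersect(all_tsg, all_ocg)
print(dual_role)
```

This simple intersection assumes that a gene is never predicted as both TSG and OCG within the same cancer type.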
For illustrative purposes and to decrease runtime, we have set nGenesPerm = 5 and nBoot = 5 in the example below. To achieve optimal results, however, we recommend setting these parameters to nGenesPerm = 2000 and nBoot = 400, which are the default values of the function arguments.
data(DEGsmatrix)
data(dataFilt)
data(dataMAF)
data(DiseaseList)
data(EAGenes)
data(tabGrowBlock)
data(knownDriverGenes)
data(LOC_transcription)
data(LOC_translation)
data(LOC_protein)
data(NCG)
data(EncodePromoters)
# Perform gene regulatory network analysis
dataGRN <- GRN(TFs = rownames(DEGsmatrix),
               DEGsmatrix = DEGsmatrix,
               DiffGenes = TRUE,
               normCounts = dataFilt,
               nGenesPerm = 5,
               kNearest = 3,
               nBoot = 5)
# Perform upstream regulatory analysis
# As an example, we use apoptosis and proliferation of cells as the biological processes
dataURA <- URA(dataGRN = dataGRN,
               DEGsmatrix = DEGsmatrix,
               BPname = c("apoptosis",
                          "proliferation of cells"),
               nCores = 1)
# Perform pattern recognition analysis
dataPRA <- PRA(dataURA = dataURA,
               BPname = c("apoptosis",
                          "proliferation of cells"),
               thres.role = 0)
# Perform driver mutation analysis
dataDMA <- DMA(dataMAF = dataMAF,
               dataDEGs = DEGsmatrix,
               dataPRA = dataPRA,
               results_folder = "DMAresults",
               coding_file = "css_coding.vcf.gz",
               noncoding_file = "css_noncoding.vcf.gz")
Next, we analyze the predicted driver genes and their mutations.
data(Oncogenic_mediators_mutation_summary)
data(DEG_Mutations_Annotations)
# Extract oncogenic mediators that contain at least one driver mutation
# These are the driver genes
knitr::kable(Oncogenic_mediators_mutation_summary %>%
               filter(!is.na(CScape_Driver)))
Hugo_Symbol | Moonlight_Oncogenic_Mediator | CScape_Passenger | CScape_Driver | CScape_Unclassified | Transcription_mut_sum | Translation_mut_sum | Protein_mut_sum | Total_Mutations | NCG_driver | NCG_cgc_annotation | NCG_vogelstein_annotation | NCG_saito_annotation | NCG_pubmed_id | NCG_cancer_type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ABCG2 | OCG | NA | 1 | NA | 0 | 0 | 1 | 1 | Candidate | NA | NA | NA | 30718927 | esophageal_adenocarcinoma |
# Extract mutation annotations of the predicted driver genes
driver_mut <- DEG_Mutations_Annotations %>%
  filter(!is.na(Moonlight_Oncogenic_Mediator),
         CScape_Mut_Class == "Driver")
# Extract driver genes with a predicted effect on the transcriptional level
transcription_mut <- Oncogenic_mediators_mutation_summary %>%
  filter(!is.na(CScape_Driver)) %>%
  filter(Transcription_mut_sum > 0)
# Extract mutation annotations of predicted driver genes that have a driver mutation
# in their promoter region with a potential effect on the transcriptional level
promoters <- DEG_Mutations_Annotations %>%
  filter(!is.na(Moonlight_Oncogenic_Mediator),
         CScape_Mut_Class == "Driver",
         Potential_Effect_on_Transcription == 1,
         Annotation == "Promoter")
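The same kind of filtering can be tried out without the package data. The sketch below builds a small toy data frame and counts driver mutations per oncogenic mediator using base R. The rows are invented, GENE2 and GENE3 are placeholder names, and the presence of a Hugo_Symbol column in DEG_Mutations_Annotations is an assumption.

```r
# Toy stand-in for DEG_Mutations_Annotations (rows invented for illustration).
toy_annotations <- data.frame(
  Hugo_Symbol = c("ABCG2", "ABCG2", "GENE2", "GENE3"),
  Moonlight_Oncogenic_Mediator = c("OCG", "OCG", "TSG", NA),
  CScape_Mut_Class = c("Driver", "Passenger", "Driver", "Driver"),
  stringsAsFactors = FALSE
)

# Keep driver mutations that fall in predicted oncogenic mediators
drivers <- subset(toy_annotations,
                  !is.na(Moonlight_Oncogenic_Mediator) &
                    CScape_Mut_Class == "Driver")

# Number of driver mutations per gene
driver_counts <- table(drivers$Hugo_Symbol)
print(driver_counts)
```

GENE3 carries a driver mutation but is not an oncogenic mediator, so it is excluded, mirroring the rule that only oncogenic mediators with at least one driver mutation are retained as driver genes.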
Please cite the MoonlightR and Moonlight2R packages:
“Interpreting pathways to discover cancer driver genes with Moonlight.” Nature Communications (2020): 10.1038/s41467-019-13803-0. (Colaprico et al. 2020)
“A workflow to study mechanistic indicators for driver gene prediction with Moonlight.” Briefings in Bioinformatics (2023): 10.1093/bib/bbad274. (Nourbakhsh et al. 2023)
Related publications to this vignette:
“TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data.” Nucleic Acids Research (2015): gkv1507. (Colaprico et al. 2016)
“TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages.” F1000Research: 10.12688/f1000research.8923.1. (Silva et al. 2016)
Session Information
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.18-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] dplyr_1.1.3 magrittr_2.0.3 Moonlight2R_1.0.0 doParallel_1.0.17
## [5] iterators_1.0.14 foreach_1.5.2 BiocStyle_2.30.0
##
## loaded via a namespace (and not attached):
## [1] fs_1.6.3 matrixStats_1.0.0
## [3] bitops_1.0-7 HiveR_0.3.63
## [5] enrichplot_1.22.0 HDO.db_0.99.1
## [7] httr_1.4.7 RColorBrewer_1.1-3
## [9] tools_4.3.1 utf8_1.2.4
## [11] R6_2.5.1 lazyeval_0.2.2
## [13] GetoptLong_1.0.5 withr_2.5.1
## [15] gridExtra_2.3 RISmed_2.3.0
## [17] cli_3.6.1 Biobase_2.62.0
## [19] Cairo_1.6-1 scatterpie_0.2.1
## [21] sass_0.4.7 readr_2.1.4
## [23] randomForest_4.7-1.1 askpass_1.2.0
## [25] Rsamtools_2.18.0 yulab.utils_0.1.0
## [27] gson_0.1.0 DOSE_3.28.0
## [29] R.utils_2.12.2 limma_3.58.0
## [31] RSQLite_2.3.1 generics_0.1.3
## [33] gridGraphics_0.5-1 shape_1.4.6
## [35] BiocIO_1.12.0 gtools_3.9.4
## [37] dendextend_1.17.1 GO.db_3.18.0
## [39] Matrix_1.6-1.1 fansi_1.0.5
## [41] S4Vectors_0.40.0 abind_1.4-5
## [43] R.methodsS3_1.8.2 lifecycle_1.0.3
## [45] yaml_2.3.7 SummarizedExperiment_1.32.0
## [47] gplots_3.1.3 qvalue_2.34.0
## [49] SparseArray_1.2.0 BiocFileCache_2.10.0
## [51] grid_4.3.1 blob_1.2.4
## [53] promises_1.2.1 crayon_1.5.2
## [55] lattice_0.22-5 cowplot_1.1.1
## [57] KEGGREST_1.42.0 magick_2.8.1
## [59] pillar_1.9.0 knitr_1.44
## [61] ComplexHeatmap_2.18.0 fgsea_1.28.0
## [63] GenomicRanges_1.54.0 tcltk_4.3.1
## [65] rjson_0.2.21 codetools_0.2-19
## [67] fastmatch_1.1-4 glue_1.6.2
## [69] ggfun_0.1.3 qpdf_1.3.2
## [71] data.table_1.14.8 parmigene_1.1.0
## [73] vctrs_0.6.4 png_0.1-8
## [75] treeio_1.26.0 gtable_0.3.4
## [77] cachem_1.0.8 xfun_0.40
## [79] S4Arrays_1.2.0 mime_0.12
## [81] tidygraph_1.2.3 statmod_1.5.0
## [83] rgl_1.2.1 interactiveDisplayBase_1.40.0
## [85] ellipsis_0.3.2 nlme_3.1-163
## [87] ggtree_3.10.0 bit64_4.0.5
## [89] filelock_1.0.2 GenomeInfoDb_1.38.0
## [91] bslib_0.5.1 KernSmooth_2.23-22
## [93] colorspace_2.1-0 BiocGenerics_0.48.0
## [95] DBI_1.1.3 tidyselect_1.2.0
## [97] bit_4.0.5 compiler_4.3.1
## [99] extrafontdb_1.0 curl_5.1.0
## [101] tidyHeatmap_1.8.1 xml2_1.3.5
## [103] DelayedArray_0.28.0 bookdown_0.36
## [105] shadowtext_0.1.2 rtracklayer_1.62.0
## [107] scales_1.2.1 caTools_1.18.2
## [109] fuzzyjoin_0.1.6 rappdirs_0.3.3
## [111] stringr_1.5.0 digest_0.6.33
## [113] rmarkdown_2.25 GEOquery_2.70.0
## [115] XVector_0.42.0 htmltools_0.5.6.1
## [117] pkgconfig_2.0.3 jpeg_0.1-10
## [119] base64enc_0.1-3 extrafont_0.19
## [121] MatrixGenerics_1.14.0 dbplyr_2.3.4
## [123] fastmap_1.1.1 rlang_1.1.1
## [125] GlobalOptions_0.1.2 htmlwidgets_1.6.2
## [127] shiny_1.7.5.1 farver_2.1.1
## [129] jquerylib_0.1.4 jsonlite_1.8.7
## [131] BiocParallel_1.36.0 GOSemSim_2.28.0
## [133] R.oo_1.25.0 RCurl_1.98-1.12
## [135] GenomeInfoDbData_1.2.11 ggplotify_0.1.2
## [137] patchwork_1.1.3 munsell_0.5.0
## [139] Rcpp_1.0.11 ape_5.7-1
## [141] viridis_0.6.4 stringi_1.7.12
## [143] ggraph_2.1.0 zlibbioc_1.48.0
## [145] MASS_7.3-60 AnnotationHub_3.10.0
## [147] plyr_1.8.9 org.Hs.eg.db_3.18.0
## [149] HPO.db_0.99.2 ggrepel_0.9.4
## [151] Biostrings_2.70.0 graphlayouts_1.0.1
## [153] splines_4.3.1 hms_1.1.3
## [155] circlize_0.4.15 seqminer_9.1
## [157] igraph_1.5.1 reshape2_1.4.4
## [159] stats4_4.3.1 BiocVersion_3.18.0
## [161] XML_3.99-0.14 evaluate_0.22
## [163] BiocManager_1.30.22 tzdb_0.4.0
## [165] tweenr_2.0.2 httpuv_1.6.12
## [167] Rttf2pt1_1.3.12 tidyr_1.3.0
## [169] purrr_1.0.2 polyclip_1.10-6
## [171] clue_0.3-65 ggplot2_3.4.4
## [173] ggforce_0.4.1 xtable_1.8-4
## [175] restfulr_0.0.15 easyPubMed_2.13
## [177] tidytree_0.4.5 MPO.db_0.99.7
## [179] later_1.3.1 viridisLite_0.4.2
## [181] tibble_3.2.1 clusterProfiler_4.10.0
## [183] aplot_0.2.2 memoise_2.0.1
## [185] AnnotationDbi_1.64.0 GenomicAlignments_1.38.0
## [187] IRanges_2.36.0 cluster_2.1.4
Colaprico, Antonio, Catharina Olsen, Matthew H. Bailey, Gabriel J. Odom, Thilde Terkelsen, Tiago C. Silva, André V. Olsen, Laura Cantini, Andrei Zinovyev, Emmanuel Barillot, Houtan Noushmehr, Gloria Bertoli, Isabella Castiglioni, Claudia Cava, Gianluca Bontempi, Xi Steven Chen, and Elena Papaleo. 2020. “Interpreting Pathways to Discover Cancer Driver Genes with Moonlight.” https://doi.org/10.1038/s41467-019-13803-0.
Colaprico, Antonio, Tiago C. Silva, Catharina Olsen, Luciano Garofano, Claudia Cava, Davide Garolini, Thais S. Sabedot, Tathiane M. Malta, Stefano M. Pagnotta, Isabella Castiglioni, Michele Ceccarelli, Gianluca Bontempi, and Houtan Noushmehr. 2016. “TCGAbiolinks: An R/Bioconductor Package for Integrative Analysis of TCGA Data.” https://doi.org/10.1093/nar/gkv1507.
Nourbakhsh, Mona, Astrid Saksager, Nikola Tom, Xi Steven Chen, Antonio Colaprico, Catharina Olsen, Matteo Tiberti, and Elena Papaleo. 2023. “A Workflow to Study Mechanistic Indicators for Driver Gene Prediction with Moonlight.” https://doi.org/10.1093/bib/bbad274.
Silva, TC, A Colaprico, C Olsen, F D’Angelo, G Bontempi, M Ceccarelli, and H Noushmehr. 2016. “TCGA Workflow: Analyze Cancer Genomics and Epigenomics Data Using Bioconductor Packages.” https://doi.org/10.12688/f1000research.8923.1.