This vignette contains some minimal examples for the main pathlinkR functions; for more complete documentation, please see our GitHub pages.
Gene expression studies such as microarrays and RNA-Seq often yield hundreds to thousands of differentially expressed genes (DEGs). It is difficult to understand the biological significance of such large data sets, especially when multiple conditions and comparisons are being analyzed. This package facilitates visualization and downstream analyses of differential gene expression results, using pathway enrichment and protein-protein interaction networks, to aid researchers in uncovering underlying biology and pathophysiology from their gene expression studies.
We have included an example data set of gene expression results in this package as the object `exampleDESeqResults`. This is a list of 2 data frames, generated using the `results()` function from the package DESeq2 (Love et al. 2014). The data is from an RNA-Seq study investigating COVID-19 and non-COVID-19 sepsis patients at admission (T1) compared to approximately 1 week later (T2) in the ICU, indexed over time (i.e., T2 vs T1) (An et al. 2023).
To install and load the package:
# BiocManager::install("pathlinkR", version="devel")
library(pathlinkR)

# We'll also be using some functions from dplyr
library(dplyr)
One of the first visualizations commonly performed with gene expression studies is to identify the number of DEGs. These are typically defined using specific cutoffs for both fold change and statistical significance. Thresholds of adjusted p-value <0.05 and absolute fold change >1.5 are used as the default, though any value can be specified. pathlinkR includes the function `eruption()` to create a volcano plot.
## A quick look at the DESeq2 results table
data("exampleDESeqResults")
knitr::kable(head(exampleDESeqResults[[1]]))
| | baseMean | log2FoldChange | lfcSE | stat | pvalue | padj |
---|---|---|---|---|---|---|
ENSG00000000938 | 16292.64814 | -0.5624954 | 0.1458274 | -3.857268 | 0.0001147 | 0.0013531 |
ENSG00000002586 | 1719.51750 | 0.4501181 | 0.1520122 | 2.961066 | 0.0030658 | 0.0153820 |
ENSG00000002919 | 870.64168 | -0.2445729 | 0.1249293 | -1.957690 | 0.0502664 | 0.1236844 |
ENSG00000002933 | 266.65476 | 0.8838310 | 0.2093628 | 4.221528 | 0.0000243 | 0.0004313 |
ENSG00000003249 | 11.43282 | 1.3287128 | 0.2881385 | 4.611369 | 0.0000040 | 0.0001200 |
ENSG00000003509 | 207.88545 | -0.1825614 | 0.1763556 | -1.035189 | 0.3005807 | 0.4453252 |
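To make the default thresholds concrete, here is a minimal base R sketch that applies them to the six rows shown above. It is only an illustration, not the code `eruption()` runs internally. It assumes the fold change cutoff of 1.5 is on the linear scale, so it is compared against `log2FoldChange` as `log2(1.5)`. The names `res`, `pCutoff`, `fcCutoff` and `direction` are chosen here for the example.

```r
# Toy data: the six rows printed in the table above (values copied from it)
res <- data.frame(
    log2FoldChange=c(
        -0.5624954, 0.4501181, -0.2445729, 0.8838310, 1.3287128, -0.1825614
    ),
    padj=c(0.0013531, 0.0153820, 0.1236844, 0.0004313, 0.0001200, 0.4453252),
    row.names=c(
        "ENSG00000000938", "ENSG00000002586", "ENSG00000002919",
        "ENSG00000002933", "ENSG00000003249", "ENSG00000003509"
    )
)

# Illustrative thresholds (assumed to mirror the defaults described above)
pCutoff <- 0.05
fcCutoff <- 1.5

# A gene counts as a DEG only if it passes BOTH cutoffs;
# the fold change cutoff is converted to the log2 scale before comparing
isDEG <- res$padj < pCutoff & abs(res$log2FoldChange) > log2(fcCutoff)

# Label each gene by direction of change
res$direction <- ifelse(
    isDEG,
    ifelse(res$log2FoldChange > 0, "up", "down"),
    "not DE"
)

table(res$direction)
```

Under these assumptions, two of the six genes (ENSG00000002933 and ENSG00000003249) are classed as up-regulated DEGs. ENSG00000000938 has a significant adjusted p-value, but its absolute fold change (about 1.48) falls just short of 1.5, which shows why both cutoffs matter.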
## Generate a volcano plot from the first data frame, with default thresholds
eruption(
    rnaseqResult=exampleDESeqResults[[1]],
    title=names(exampleDESeqResults[1])
)