
This vignette contains some minimal examples for the main pathlinkR functions; for more complete documentation, please see our GitHub pages.

1 Introduction

Gene expression studies such as microarrays and RNA-Seq often yield hundreds to thousands of differentially expressed genes (DEGs). It can be very difficult to understand the biological significance of such large data sets, especially when multiple conditions and comparisons are being analyzed. This package facilitates visualization and downstream analyses of differential gene expression results, using pathway enrichment and protein-protein interaction networks, to aid researchers in uncovering the underlying biology and pathophysiology from their gene expression studies.

We have included an example data set of gene expression results in this package as the object exampleDESeqResults. This is a list of 2 data frames, generated using the results() function from the package DESeq2 (Love et al. 2014). The data is from an RNA-Seq study investigating COVID-19 and non-COVID-19 sepsis patients at admission (T1) compared to approximately 1 week later (T2) in the ICU, indexed over time (i.e., T2 vs T1) (An et al. 2023).

2 Installation

To install and load the package:

# Install pathlinkR from Bioconductor (uncomment to run)
# BiocManager::install("pathlinkR", version="devel")

# We'll also be using some functions from dplyr
library(dplyr)
library(pathlinkR)

3 Visualizing RNA-Seq data with volcano plots

One of the first steps in a gene expression study is typically to identify and visualize the DEGs. These are usually defined using specific cutoffs for both fold change and statistical significance. Thresholds of adjusted p-value < 0.05 and absolute fold change > 1.5 are used as the defaults, though any values can be specified. pathlinkR includes the function eruption() to create a volcano plot.

## A quick look at the DESeq2 results table
data("exampleDESeqResults")
knitr::kable(head(exampleDESeqResults[[1]]))

|                |    baseMean | log2FoldChange |     lfcSE |      stat |    pvalue |      padj |
|:---------------|------------:|---------------:|----------:|----------:|----------:|----------:|
|ENSG00000000938 | 16292.64814 |     -0.5624954 | 0.1458274 | -3.857268 | 0.0001147 | 0.0013531 |
|ENSG00000002586 |  1719.51750 |      0.4501181 | 0.1520122 |  2.961066 | 0.0030658 | 0.0153820 |
|ENSG00000002919 |   870.64168 |     -0.2445729 | 0.1249293 | -1.957690 | 0.0502664 | 0.1236844 |
|ENSG00000002933 |   266.65476 |      0.8838310 | 0.2093628 |  4.221528 | 0.0000243 | 0.0004313 |
|ENSG00000003249 |    11.43282 |      1.3287128 | 0.2881385 |  4.611369 | 0.0000040 | 0.0001200 |
|ENSG00000003509 |   207.88545 |     -0.1825614 | 0.1763556 | -1.035189 | 0.3005807 | 0.4453252 |

## Generate a volcano plot from the first data frame, with default thresholds
eruption(
    rnaseqResult=exampleDESeqResults[[1]],
    title=names(exampleDESeqResults[1])
)
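
To make the default thresholds concrete, the sketch below classifies genes by hand in base R. It is only an illustration and is not how eruption() is implemented: classifyDegs() is a hypothetical helper that does not exist in pathlinkR, its argument names are made up for this example, and it assumes the fold change cutoff of 1.5 is on the linear scale, so it is converted with log2() before being compared to the log2FoldChange column. The toy data frame holds the six rows printed above, rounded to four decimals.

```r
# Toy data: the six rows shown by head(exampleDESeqResults[[1]]) above,
# rounded to four decimals. Only the two columns needed here are kept.
toyResults <- data.frame(
    log2FoldChange = c(-0.5625, 0.4501, -0.2446, 0.8838, 1.3287, -0.1826),
    padj = c(0.0014, 0.0154, 0.1237, 0.0004, 0.0001, 0.4453),
    row.names = c(
        "ENSG00000000938", "ENSG00000002586", "ENSG00000002919",
        "ENSG00000002933", "ENSG00000003249", "ENSG00000003509"
    )
)

# Hypothetical helper (NOT part of pathlinkR): label each gene as "up",
# "down", or "not DE" using an adjusted p-value cutoff and a fold change
# cutoff. Assumption: fcCutoff is on the linear scale, hence log2(fcCutoff).
classifyDegs <- function(results, pCutoff = 0.05, fcCutoff = 1.5) {
    # DESeq2 can report NA for padj; treat those genes as not significant
    significant <- !is.na(results$padj) & results$padj < pCutoff
    largeChange <- abs(results$log2FoldChange) > log2(fcCutoff)

    status <- ifelse(
        significant & largeChange,
        ifelse(results$log2FoldChange > 0, "up", "down"),
        "not DE"
    )
    factor(status, levels = c("down", "not DE", "up"))
}

toyResults$status <- classifyDegs(toyResults)

# Tally the genes in each category
table(toyResults$status)
```

On these six rows, only ENSG00000002933 and ENSG00000003249 pass both thresholds, and both are up-regulated. ENSG00000000938 is significant by adjusted p-value, but its absolute log2 fold change (about 0.56) falls just short of log2(1.5), which is about 0.585.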