RNAseqCovarImpute

This is the development version of RNAseqCovarImpute; for the stable release version, see RNAseqCovarImpute.

Impute Covariate Data in RNA Sequencing Studies


Bioconductor version: Development (3.21)

The RNAseqCovarImpute package makes linear model analysis of RNA sequencing read counts compatible with multiple imputation (MI) of missing covariates. A major problem with implementing MI in RNA sequencing studies is that the outcome data must be included in the imputation prediction models to avoid bias, which is difficult in omics studies with high-dimensional data. The first method we developed in the RNAseqCovarImpute package surmounts the problem of high-dimensional outcome data by binning genes into smaller groups that are analyzed pseudo-independently. This method implements covariate MI in gene expression studies by:
1) randomly binning genes into smaller groups,
2) creating M imputed datasets separately within each bin, where the imputation predictor matrix includes all covariates and the log counts per million (CPM) for the genes within each bin,
3) estimating gene expression changes using the `limma::voom` function followed by the `limma::lmFit` function, separately on each of the M imputed datasets within each gene bin,
4) un-binning the gene sets and stacking the M sets of model results, then using the `limma::squeezeVar` function to apply a variance-shrinking Bayesian procedure to each of the M sets of model results,
5) pooling the results with Rubin's rules to produce combined coefficients, standard errors, and P-values, and
6) adjusting P-values for multiplicity to control the false discovery rate (FDR).
Because binning genes into smaller groups requires running the MI and limma-voom analyses many times (typically hundreds), a faster method uses principal component analysis (PCA) to avoid binning genes while still retaining outcome information in the MI models.
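The random binning in step 1 of the binning method can be sketched in base R as follows. This is only an illustration: `bin_genes`, `genes_per_bin`, and the gene identifiers are hypothetical names made up for this example and are not part of the RNAseqCovarImpute API.

```r
# Hypothetical helper (not a package function): shuffle gene IDs and
# split them into bins of at most `genes_per_bin` genes each.
bin_genes <- function(gene_ids, genes_per_bin, seed = 1) {
  set.seed(seed)                       # note: this resets the global RNG state
  shuffled <- sample(gene_ids)         # random permutation of the gene IDs
  bin_index <- ceiling(seq_along(shuffled) / genes_per_bin)
  split(shuffled, bin_index)           # list with one character vector per bin
}

# Made-up gene IDs for illustration
gene_ids <- paste0("gene", 1:25)
bins <- bin_genes(gene_ids, genes_per_bin = 10)
lengths(bins)                          # number of genes in each bin
```

Each element of `bins` would then supply the log CPM columns that join the covariates in that bin's imputation predictor matrix.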
The more computationally efficient MI PCA method implements covariate MI in gene expression studies by:
1) performing PCA on the log CPM values for all genes using the Bioconductor `PCAtools` package,
2) creating M imputed datasets where the imputation predictor matrix includes all covariates and the optimum number of PCs to retain (e.g., based on Horn's parallel analysis or the number of PCs that together account for >80% of the explained variation),
3) conducting the standard limma-voom pipeline, with the `voom` function followed by the `lmFit` and `eBayes` functions, on each of the M imputed datasets,
4) pooling the results with Rubin's rules to produce combined coefficients, standard errors, and P-values, and
5) adjusting P-values for multiplicity to control the false discovery rate (FDR).
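The pooling and multiplicity steps shared by both methods can be illustrated with a minimal base R sketch. It assumes the classic Rubin's rules with Rubin's original degrees of freedom, which may differ from the exact formula the package uses, and it omits the variance shrinkage step. `pool_rubin` and all numbers below are made up for this example.

```r
# Hypothetical helper (not a package function): pool one coefficient
# across M imputed datasets using Rubin's rules.
pool_rubin <- function(estimates, std_errors) {
  stopifnot(length(estimates) == length(std_errors), length(estimates) > 1)
  m <- length(estimates)
  q_bar <- mean(estimates)                      # pooled coefficient
  within_var <- mean(std_errors^2)              # average within-imputation variance
  between_var <- var(estimates)                 # between-imputation variance
  total_var <- within_var + (1 + 1 / m) * between_var
  # Rubin's original degrees of freedom (an assumption of this sketch)
  df <- (m - 1) * (1 + within_var / ((1 + 1 / m) * between_var))^2
  t_stat <- q_bar / sqrt(total_var)
  p_value <- 2 * pt(-abs(t_stat), df = df)      # two-sided P-value
  c(estimate = q_bar, se = sqrt(total_var), df = df, p = p_value)
}

# Made-up results: rows are genes, columns are M = 3 imputed datasets
coefs <- rbind(geneA = c(0.50, 0.55, 0.45), geneB = c(0.02, -0.01, 0.03))
ses   <- rbind(geneA = c(0.10, 0.11, 0.09), geneB = c(0.10, 0.10, 0.10))

pooled <- t(sapply(rownames(coefs), function(g) pool_rubin(coefs[g, ], ses[g, ])))

# Benjamini-Hochberg adjustment across genes to control the FDR
pooled <- cbind(pooled, fdr = p.adjust(pooled[, "p"], method = "BH"))
pooled
```

The total variance adds the between-imputation spread to the average within-imputation variance, so the pooled standard errors grow when the imputed datasets disagree.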

Author: Brennan Baker [aut, cre], Sheela Sathyanarayana [aut], Adam Szpiro [aut], James MacDonald [aut], Alison Paquette [aut]

Maintainer: Brennan Baker <brennanhilton at gmail.com>

Citation (from within R, enter citation("RNAseqCovarImpute")):

Installation

To install this package, start R (version "4.5") and enter:


if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# The following initializes usage of Bioc devel
BiocManager::install(version='devel')

BiocManager::install("RNAseqCovarImpute")

For older versions of R, please refer to the appropriate Bioconductor release.

Documentation

To view documentation for the version of this package installed in your system, start R and enter:

browseVignettes("RNAseqCovarImpute")
Example Data for RNAseqCovarImpute HTML R Script
Impute Covariate Data in RNA-sequencing Studies HTML R Script
Reference Manual PDF
NEWS Text

Details

biocViews DifferentialExpression, GeneExpression, RNASeq, Sequencing, Software
Version 1.5.0
In Bioconductor since BioC 3.18 (R-4.3) (1.5 years)
License GPL-3
Depends R (>= 4.3.0)
Imports Biobase, BiocGenerics, BiocParallel, stats, limma, dplyr, magrittr, rlang, edgeR, foreach, mice
System Requirements
URL https://github.com/brennanhilton/RNAseqCovarImpute
Bug Reports https://github.com/brennanhilton/RNAseqCovarImpute/issues
Suggests BiocStyle, knitr, PCAtools, rmarkdown, tidyr, stringr, testthat (>= 3.0.0)
Linking To
Enhances
Depends On Me
Imports Me
Suggests Me
Links To Me
Build Report Build Report

Package Archives

Follow Installation instructions to use this package in your R session.

Source Package RNAseqCovarImpute_1.5.0.tar.gz
Windows Binary (x86_64) RNAseqCovarImpute_1.5.0.zip
macOS Binary (x86_64) RNAseqCovarImpute_1.5.0.tgz
macOS Binary (arm64) RNAseqCovarImpute_1.5.0.tgz
Source Repository git clone https://git.bioconductor.org/packages/RNAseqCovarImpute
Source Repository (Developer Access) git clone git@git.bioconductor.org:packages/RNAseqCovarImpute
Bioc Package Browser https://code.bioconductor.org/browse/RNAseqCovarImpute/
Package Short Url https://bioconductor.org/packages/RNAseqCovarImpute/
Package Downloads Report Download Stats