proBatch {proBatch} | R Documentation |
The proBatch package contains functions for analyzing and correcting batch effects and other unwanted technical variation in high-throughput experiments. Although the package was developed primarily for mass spectrometry proteomics (DIA/SWATH), it should also be applicable to most omics data with minor adaptations. It addresses the following needs:
Prepare the data for analysis;
Visualize batch effects at the sample-wide and feature level;
Normalize and correct for batch effects.
df_long |
data frame where each row is a single feature in a single
sample. It minimally has a |
data_matrix |
features (in rows) by samples (in columns) matrix, with feature IDs as rownames and file/sample names as colnames. Usually the log-transformed version of the original data |
sample_annotation |
data matrix with:
|
sample_id_col |
name of the column in sample_annotation where the filenames (colnames of the data matrix) are found |
batch_col |
column in |
order_col |
column in |
measure_col |
if |
feature_id_col |
name of the column with feature/gene/peptide/protein
ID used in the long format representation |
plot_title |
Title of the plot (usually, processing step + representation level (fragments, transitions, proteins)) |
theme |
ggplot theme, by default |
To learn more about proBatch, start with the vignettes:
browseVignettes(package = "proBatch")
Common arguments to the functions.
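
The data structures named in these arguments can be illustrated with a small base R sketch. It is not taken from the package: the column names (`peptide`, `run`, `intensity`, `batch`, `order`), the toy values, and the `+ 1` offset before the log transform are all assumptions made for illustration, and the reshaping uses `tapply` rather than any proBatch function.

```r
# Hypothetical long-format data: one row per feature per sample.
# Column names are illustrative assumptions, not package defaults.
df_long <- data.frame(
  peptide   = rep(c("pep_A", "pep_B", "pep_C"), times = 4),
  run       = rep(c("run_1", "run_2", "run_3", "run_4"), each = 3),
  intensity = c(1023, 511, 255, 2047, 1023, 511, 127, 63, 31, 255, 127, 63),
  stringsAsFactors = FALSE
)

# Hypothetical sample annotation: one row per sample,
# with a batch column and a run-order column.
sample_annotation <- data.frame(
  run   = c("run_1", "run_2", "run_3", "run_4"),
  batch = c("batch_1", "batch_1", "batch_2", "batch_2"),
  order = 1:4,
  stringsAsFactors = FALSE
)

# Reshape to a features (rows) by samples (columns) matrix.
# Each feature/sample pair occurs once here, so sum() just returns that value.
data_matrix <- tapply(df_long$intensity,
                      list(df_long$peptide, df_long$run),
                      FUN = sum)

# Log-transform the intensities; the offset of 1 is an assumption
# that keeps zero intensities finite.
log_matrix <- log2(data_matrix + 1)

# Every column of the matrix should have a matching annotation row.
stopifnot(all(colnames(log_matrix) %in% sample_annotation$run))
```

In this sketch, `"run"` would play the role of `sample_id_col`, `"peptide"` of `feature_id_col`, `"intensity"` of `measure_col`, `"batch"` of `batch_col`, and `"order"` of `order_col`.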