affiXcanTrain {AffiXcan} | R Documentation |
Train the models needed to impute the GReX (genetically regulated expression) of each gene
affiXcanTrain(exprMatrix, assay, tbaPaths, regionAssoc, cov, varExplained, scale, BPPARAM = bpparam())
exprMatrix |
A SummarizedExperiment object containing expression data |
assay |
A string with the name of the object in SummarizedExperiment::assays(exprMatrix) that contains expression values |
tbaPaths |
A vector of strings giving the paths to the MultiAssayExperiment RDS files that contain the TBA values |
regionAssoc |
A data.frame associating regulatory regions with expressed genes; its colnames must be c("REGULATORY_REGION", "EXPRESSED_REGION") |
cov |
A data.frame of covariate values describing the population structure, where the columns are the PCs and the rows are the individual IIDs |
varExplained |
An integer between 0 and 100; varExplained=80 means that the principal components selected to fit the models must together explain at least 80 percent of the variation in the TBA values |
scale |
A logical; if scale=FALSE the TBA values will only be centered, not scaled, before PCA is performed |
BPPARAM |
A BiocParallelParam object. Default is bpparam(). For details on BiocParallelParam virtual base class see browseVignettes("BiocParallel") |
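The interplay of varExplained and scale described above can be illustrated with a minimal sketch on toy data. It is not AffiXcan's internal code: base R's prcomp() is used as a stand-in for the PCA step, and the tba matrix, its dimensions and the variable names cumVar and nPcs are hypothetical.

```r
# Illustrative sketch only: toy data and prcomp() are assumptions,
# not the package's actual implementation.
set.seed(1)
tba <- matrix(rnorm(50 * 6), nrow = 50, ncol = 6)  # 50 individuals x 6 hypothetical TBA columns
varExplained <- 80
scale <- TRUE

# Centering is always applied; scaling depends on the 'scale' flag
pca <- prcomp(tba, center = TRUE, scale. = scale)

# Cumulative percentage of variance explained by the leading components
cumVar <- 100 * cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components whose cumulative variance reaches the threshold
nPcs <- which(cumVar >= varExplained)[1]

# Loadings and scores restricted to the selected components
eigenvectors <- pca$rotation[, seq_len(nPcs), drop = FALSE]
pcs <- pca$x[, seq_len(nPcs), drop = FALSE]
```

With scale=FALSE, columns with larger variance dominate the leading components, so the number of components needed to reach the same threshold may differ.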
A list containing three objects: pca, bs, regionsCount
pca: A list of lists, each named after one of the experiments (MultiAssayExperiment::experiments()) found in the MultiAssayExperiment objects listed in the param tbaPaths. Each of these lists contains two objects:
eigenvectors: A matrix containing eigenvectors for those principal components selected according to the param varExplained
pcs: A matrix containing the principal components values selected according to the param varExplained
bs: A list of lists, each named after a REGULATORY_REGION found in the param regionAssoc that has a corresponding colname in the experiments stored in the MultiAssayExperiment objects listed in the param tbaPaths. Each of the lists in bs contains four objects:
coefficients: The coefficients of the principal components used in the model, analogous to the "coefficients" in the results of lm()
pval: The uncorrected ANOVA p-value of the model, retrieved from anova(model, modelReduced, test="F")$'Pr(>F)'[2]
r.sq: The coefficient of determination between the real total expression values and the imputed GReX, retrieved from summary(model)$r.squared
correctedP: The p-value after the Benjamini-Hochberg correction for multiple testing, retrieved using p.adjust(pvalues, method="BH")
regionsCount: An integer, that is the number of genomic regions taken into account during the training phase
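The quantities stored in bs can be reproduced on toy data with the calls quoted above. The following is a rough sketch under stated assumptions: the simulated data, the column names PC1..PC3 and C1, C2, and the choice of a covariates-only modelReduced are all hypothetical and are not taken from the package source.

```r
# Illustrative sketch only: simulated data; the content of 'modelReduced'
# (covariates only) is an assumption.
set.seed(2)
n <- 60
pcs <- matrix(rnorm(n * 3), ncol = 3,
              dimnames = list(NULL, c("PC1", "PC2", "PC3")))  # hypothetical TBA PCs
covars <- matrix(rnorm(n * 2), ncol = 2,
                 dimnames = list(NULL, c("C1", "C2")))        # hypothetical covariates
expr <- 0.8 * pcs[, "PC1"] + 0.3 * covars[, "C1"] + rnorm(n)  # simulated expression
d <- data.frame(expr = expr, pcs, covars)

# Full model (PCs plus covariates) and reduced model (covariates only)
model <- lm(expr ~ PC1 + PC2 + PC3 + C1 + C2, data = d)
modelReduced <- lm(expr ~ C1 + C2, data = d)

coefficients <- coef(model)
pval <- anova(model, modelReduced, test = "F")$'Pr(>F)'[2]
r.sq <- summary(model)$r.squared

# Benjamini-Hochberg correction across genes; the other p-values are made up
pvalues <- c(pval, 0.04, 0.2, 0.001)
correctedP <- p.adjust(pvalues, method = "BH")
```

The correction is applied to the whole vector of p-values at once, so each correctedP value depends on how many models were tested.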
if(interactive()) {
    trainingTbaPaths <- system.file("extdata","training.tba.toydata.rds",
    package="AffiXcan")

    data(exprMatrix)
    data(regionAssoc)
    data(trainingCovariates)

    assay <- "values"

    training <- affiXcanTrain(exprMatrix=exprMatrix, assay=assay,
    tbaPaths=trainingTbaPaths, regionAssoc=regionAssoc, cov=trainingCovariates,
    varExplained=80, scale=TRUE)
}