assocTestSingle {GENESIS} | R Documentation |
assocTestSingle
performs genotype association tests
using the null model fit with fitNullModel
.
## S4 method for signature 'SeqVarIterator' assocTestSingle(gdsobj, null.model, test=c("Score", "Score.SPA"), recalc.pval.thresh=0.05, GxE=NULL, sparse=TRUE, imputed=FALSE, male.diploid=TRUE, genome.build=c("hg19", "hg38"), verbose=TRUE) ## S4 method for signature 'GenotypeIterator' assocTestSingle(gdsobj, null.model, test=c("Score", "Score.SPA"), recalc.pval.thresh=0.05, GxE=NULL, male.diploid=TRUE, verbose=TRUE)
gdsobj |
An object of class |
null.model |
A null model object returned by |
test |
A character string specifying the type of test to be performed. The possibilities are |
recalc.pval.thresh |
If test is not "Score", recalculate p-values using the specified 'test' for variants with a Score p-value below this threshold; return the score p-value for all other variants. |
GxE |
A vector of character strings specifying the names of the variables for which a genotype interaction term should be included. If |
sparse |
Logical indicator of whether to read genotypes as sparse Matrix objects; the default is |
imputed |
Logical indicator of whether to read dosages from the "DS" field containing imputed dosages instead of counting the number of alternate alleles. |
male.diploid |
Logical for whether males on sex chromosomes are coded as diploid. |
genome.build |
A character sting indicating genome build; used to identify pseudoautosomal regions on the X and Y chromosomes. |
verbose |
Logical indicator of whether updates from the function should be printed to the console; the default is |
All samples included in null model
will be included in the association test, even if a different set of samples is present in the current filter for gdsobj
.
The effect size estimate is for each copy of the alternate allele. For multiallelic variants, each alternate allele is tested separately.
Sporadic missing genotype values are mean imputed using the allele frequency calculated on all other samples at that variant.
Monomorphic variants (including variants where every sample is a heterozygote) are omitted from the results.
The input GxE
can be used to perform GxE tests. Multiple interaction variables may be specified, but all interaction variables specified must have been included as covariates in fitting the null model with fitNullModel
. When performing GxE analyses, assocTestSingle
will report two tests: (1) the joint test of all genotype interaction terms in the model (this is the test for any genotype interaction effect), and (2) the joint test of the genotype term along with all of the genotype interaction terms (this is the test for any genetic effect). Individual genotype interaction terms can be tested by creating test statistics from the reported effect size estimates and their standard errors (Note: when GxE
contains a single continuous or binary covariate, this test is the same as the test for any genotype interaction effect mentioned above).
The saddle point approximation (SPA), run by using test = "Score.SPA"
, implements the method described by Dey et al. (2017), which was extended to mixed models by Zhou et al. (2018) in their SAIGE software. SPA provides better calibration of p-values when fitting mixed models with a binomial family for a sample with an imbalanced case to control ratio.
For the GenotypeIterator
method, objects created with GdsGenotypeReader
or MatrixGenotypeReader
are supported. NcdfGenotypeReader
objects are not supported.
A data.frame where each row refers to a different variant with the columns:
variant.id |
The variant ID |
chr |
The chromosome value |
pos |
The base pair position |
allele.index |
The index of the alternate allele. For biallelic variants, this will always be 1. |
n.obs |
The number of samples with non-missing genotypes |
freq |
The estimated alternate allele frequency |
MAC |
The minor allele count. For multiallelic variants, "minor" is determined by comparing the count of the alternate allele specified by |
If test
is "Score"
:
Score |
The value of the score function |
Score.SE |
The estimated standard error of the Score |
Score.Stat |
The score Z test statistic |
Score.pval |
The score p-value |
Est |
An approximation of the effect size estimate for each additional copy of the alternate allele |
Est.SE |
An approximation of the standard error of the effect size estimate |
PVE |
An approximation of the proportion of phenotype variance explained |
If test
is "Score.SPA"
:
SPA.pval |
The score p-value after applying the saddle point approximation (SPA) |
SPA.converged |
logical indiactor of whether the SPA converged; |
If GxE
is not NULL
:
Est.G |
The effect size estimate for the genotype term |
Est.G:env |
The effect size estimate for the genotype*env interaction term. There will be as many of these terms as there are interaction variables, and "env" will be replaced with the variable name. |
SE.G |
The estimated standard error of the genotype term effect size estimate |
SE.G:env |
The estimated standard error of the genotype*env effect size estimate. There will be as many of these terms as there are interaction variables, and "env" will be replaced with the variable name. |
GxE.Stat |
The Wald Z test statistic for the test of all genotype interaction terms. When there is only one genotype interaction term, this is the test statistic for that term. |
GxE.pval |
The Wald p-value for the test of all genotype interaction terms; i.e. the test of any genotype interaction effect |
Joint.Stat |
The Wald Z test statistic for the joint test of the genotype term and all of the genotype interaction terms |
Joint.pval |
The Wald p-value for the joint test of the genotype term and all of the genotype interaction terms; i.e. the test of any genotype effect |
Matthew P. Conomos, Stephanie M. Gogarten, Tamar Sofer, Ken Rice, Chaoyu Yu
Dey, R., Schmidt, E. M., Abecasis, G. R., & Lee, S. (2017). A fast and accurate algorithm to test for binary phenotypes and its application to PheWAS. The American Journal of Human Genetics, 101(1), 37-49. Zhou, W., Nielsen, J. B., Fritsche, L. G., Dey, R., Gabrielsen, M. E., Wolford, B. N., ... & Bastarache, L. A. (2018). Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nature genetics, 50(9), 1335.
fitNullModel
for fitting the null mixed model needed as input to assocTestSingle
.
SeqVarIterator
for creating the input object with genotypes.
library(SeqVarTools) library(Biobase) # open a sequencing GDS file gdsfile <- seqExampleFileName("gds") gds <- seqOpen(gdsfile) # simulate some phenotype data set.seed(4) data(pedigree) pedigree <- pedigree[match(seqGetData(gds, "sample.id"), pedigree$sample.id),] pedigree$outcome <- rnorm(nrow(pedigree)) # construct a SeqVarIterator object seqData <- SeqVarData(gds, sampleData=AnnotatedDataFrame(pedigree)) iterator <- SeqVarBlockIterator(seqData) # fit the null model nullmod <- fitNullModel(iterator, outcome="outcome", covars="sex") # run the association test assoc <- assocTestSingle(iterator, nullmod) seqClose(iterator) library(GWASTools) # open a SNP-based GDS file gdsfile <- system.file("extdata", "HapMap_ASW_MXL_geno.gds", package="GENESIS") gds <- GdsGenotypeReader(filename = gdsfile) # simulate some phenotype data set.seed(4) pheno <- data.frame(scanID=getScanID(gds), outcome=rnorm(nscan(gds))) # construct a GenotypeIterator object genoData <- GenotypeData(gds, scanAnnot=ScanAnnotationDataFrame(pheno)) iterator <- GenotypeBlockIterator(genoData) # fit the null model nullmod <- fitNullModel(iterator, outcome="outcome") # run the association test assoc <- assocTestSingle(iterator, nullmod) close(iterator)