LORD {onlineFDR}    R Documentation

Online FDR control based on recent discovery

Description

Implements the LORD procedure for online FDR control, where LORD stands for (significance) Levels based On Recent Discovery, as presented by Javanmard and Montanari (2018).

Usage

LORD(d, alpha = 0.05, gammai, version = 3, w0 = alpha/10, b0 = alpha -
  w0, random = TRUE, date.format = "%Y-%m-%d")

Arguments

d

Dataframe with three columns: an identifier (‘id’), date (‘date’) and p-value (‘pval’). If no column of dates is provided, then the p-values are treated as being ordered sequentially with no batches.

alpha

Overall significance level of the FDR procedure, the default is 0.05.

gammai

Optional vector of γ_i. A default is provided as proposed by Javanmard and Montanari (2018), equation 31.

version

An integer from 1 to 3 giving the version of LORD to use. Defaults to 3.

w0

Initial ‘wealth’ of the procedure. Defaults to α/10.

b0

The ‘payout’ for rejecting a hypothesis. Defaults to α - w_0.

random

Logical. If TRUE (the default), then the order of the p-values in each batch (i.e. those that have exactly the same date) is randomised.

date.format

Optional string giving the format that is used for dates. Defaults to "%Y-%m-%d".

Details

The function takes as its input a dataframe with three columns: an identifier (‘id’), date (‘date’) and p-value (‘pval’). The case where p-values arrive in batches corresponds to multiple instances of the same date. If no column of dates is provided, then the p-values are treated as being ordered sequentially with no batches.
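To make the batch convention concrete, here is a minimal sketch of how rows sharing a date form a batch and how the order within each batch could be shuffled. The data frame `toy` and the random tie-breaking key are assumptions made for illustration; this is not the package's internal code.

```r
# Hypothetical toy data: two dates, hence two batches of two p-values each.
toy <- data.frame(
  id   = c("h1", "h2", "h3", "h4"),
  date = as.Date(c("2016-05-19", "2014-12-01", "2014-12-01", "2016-05-19")),
  pval = c(0.20, 0.01, 0.03, 0.004)
)

# Batch sizes: the number of p-values that share each date.
batch_sizes <- table(toy$date)

# Sort by date; ties within a date are broken by a random key,
# so only the order inside each batch is shuffled.
set.seed(1)
toy_ordered <- toy[order(toy$date, runif(nrow(toy))), ]
```

Fixing the seed before the shuffle keeps the resulting order reproducible, which is why the examples on this page call `set.seed` before `LORD` when `random = TRUE`.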

The LORD procedure controls FDR for independent p-values. Given an overall significance level α, we choose a sequence of non-negative numbers γ_i such that they sum to 1, and γ_i ≥ γ_j for i ≤ j.
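As an illustration of these conditions, the sketch below builds one possible sequence and checks that it is non-negative, sums to 1 and is non-increasing. The choice γ_i ∝ i^(-1.5), truncated at `N` terms and renormalised, is an assumption made for this example only; it is not the package default.

```r
# Hypothetical sequence: gamma_i proportional to i^(-1.5), truncated at N terms.
N <- 15
raw <- seq_len(N)^(-1.5)
gammai <- raw / sum(raw)   # renormalise so the N terms sum to 1

# The three conditions on the gamma sequence.
stopifnot(all(gammai >= 0))                   # non-negative
stopifnot(isTRUE(all.equal(sum(gammai), 1)))  # sums to 1
stopifnot(all(diff(gammai) <= 0))             # non-increasing in i
```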

Javanmard and Montanari (2018) present three versions of LORD, which differ in how the adjusted test levels α_i are calculated. The test levels for LORD 1 are based on the time of the last discovery (i.e. hypothesis rejection), those for LORD 2 on all previous discovery times, and those for LORD 3 on the time of the last discovery as well as the ‘wealth’ accumulated at that time.
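To show how test levels can depend on all previous discovery times, the sketch below follows one reading of the LORD 2 rule, α_i = γ_i w_0 + b_0 Σ γ_(i - τ_l), where the sum runs over discovery times τ_l < i. The function name `lord2_sketch` and the γ sequence are hypothetical, and this is not the package's implementation; see Javanmard and Montanari (2018) for the definitive statement.

```r
# Sketch of a LORD 2 style update (an assumption-laden illustration).
lord2_sketch <- function(pval, gammai, alpha = 0.05,
                         w0 = alpha / 10, b0 = alpha - w0) {
  n <- length(pval)
  alphai <- numeric(n)
  R <- integer(n)
  tau <- integer(0)  # times of previous discoveries
  for (i in seq_len(n)) {
    # Initial wealth term plus one payout term per earlier discovery.
    alphai[i] <- gammai[i] * w0 + b0 * sum(gammai[i - tau])
    R[i] <- as.integer(pval[i] <= alphai[i])
    if (R[i] == 1L) tau <- c(tau, i)
  }
  data.frame(pval = pval, alphai = alphai, R = R)
}

# Hypothetical gamma sequence, truncated at the number of tests.
pval <- c(2.90e-17, 0.06743, 0.01514, 0.08174, 0.00171)
raw <- seq_along(pval)^(-1.5)
gammai <- raw / sum(raw)

out <- lord2_sketch(pval, gammai)
```

Before the first discovery `tau` is empty, so the sum is zero and the test level reduces to γ_i w_0.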

LORD depends on constants w_0 and b_0, where w_0 ≥ 0 represents the initial ‘wealth’ of the procedure and b_0 > 0 represents the ‘payout’ for rejecting a hypothesis. We require w_0+b_0 ≤ α for FDR control to hold.
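The constraint on the constants can be checked before running the procedure. The helper `check_lord_constants` below is hypothetical (it is not part of onlineFDR); it mirrors the defaults shown under Usage and allows a small numerical tolerance, which is also an assumption.

```r
# Hypothetical helper: TRUE if w0 >= 0, b0 > 0 and w0 + b0 <= alpha.
check_lord_constants <- function(alpha, w0 = alpha / 10, b0 = alpha - w0) {
  tol <- 1e-12  # guard against floating-point rounding in w0 + b0
  w0 >= 0 && b0 > 0 && (w0 + b0) <= alpha + tol
}

ok_default <- check_lord_constants(0.05)                       # defaults: w0 + b0 = alpha
ok_custom  <- check_lord_constants(0.1, w0 = 0.05)             # b0 defaults to alpha - w0
bad        <- check_lord_constants(0.05, w0 = 0.04, b0 = 0.02) # w0 + b0 exceeds alpha
```

With the defaults, w_0 + b_0 equals α exactly, so the whole budget is split between initial wealth and payout.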

Note that FDR control also holds for the LORD procedure if only the p-values corresponding to true nulls are mutually independent, and independent from the non-null p-values.

Further details of the LORD procedure can be found in Javanmard and Montanari (2018).

Value

d.out

A dataframe with the original dataframe d (which will be reordered if there are batches and random = TRUE), the LORD-adjusted test levels α_i and the indicator function of discoveries R. Hypothesis i is rejected if the i-th p-value is less than or equal to α_i, in which case R[i] = 1 (otherwise R[i] = 0).

References

Javanmard, A. and Montanari, A. (2018) Online Rules for Control of False Discovery Rate and False Discovery Exceedance. Annals of Statistics, 46(2):526-554.

See Also

LORDdep uses a modified version of the LORD algorithm that is valid for dependent p-values.

Examples

sample.df <- data.frame(
id = c('A15432', 'B90969', 'C18705', 'B49731', 'E99902',
    'C38292', 'A30619', 'D46627', 'E29198', 'A41418',
    'D51456', 'C88669', 'E03673', 'A63155', 'B66033'),
date = as.Date(c(rep("2014-12-01",3),
                rep("2015-09-21",5),
                rep("2016-05-19",2),
                "2016-11-12",
                rep("2017-03-27",4))),
pval = c(2.90e-17, 0.06743, 0.01514, 0.08174, 0.00171,
        3.60e-05, 0.79149, 0.27201, 0.28295, 7.59e-08,
        0.69274, 0.30443, 0.00136, 0.72342, 0.54757))

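# Keep the original order within each batch (no randomisation)
LORD(sample.df, random=FALSE)
# LORD 2 with randomised within-batch order; the seed makes it reproducible
set.seed(1); LORD(sample.df, version=2)
# Larger significance level and initial wealth; b0 defaults to alpha - w0
set.seed(1); LORD(sample.df, alpha=0.1, w0=0.05)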



[Package onlineFDR version 1.0.0 Index]