Netbenchmark

Pau Bellot, Catharina Olsen, Patrick Meyer

2019-05-02

In the last decade, several methods have tackled the challenge of reconstructing gene regulatory networks from gene expression data. Several papers have compared and evaluated the different network inference methods relying on simulated data.

This package provides a new comparison that assesses the methods in a highly heterogeneous data scenario, which can reveal how methods specialize for different network types and data.

This package allows repeating the comparison between different network inference algorithms with only one line of code.

Toy example for main benchmark:

    library(netbenchmark)
## Loading required package: grndata
    top20.aupr <- netbenchmark(methods="all",datasources.names = "Toy",
                               local.noise=20,global.noise=10,
                               noiseType=c("normal","lognormal"),
                               datasets.num = 2,experiments = 40,
                               seed=1422976420,verbose=FALSE)
## Warning in netbenchmark(methods = "all", datasources.names = "Toy", local.noise = 20, : The specified number of experiments and 
## datasets is bigger than the orginal number of experiments in the datasource: 
## toy, sampling with replacement will be used
## Estimate (local) false discovery rates (partial correlations):
## Estimate (local) false discovery rates (partial correlations):

The first element of the returned list is the \(AUPR_{20}\):

    print(top20.aupr[[1]])
##   Origin experiments aracne.wrap c3net.wrap  clr.wrap GeneNet.wrap
## 1    toy          48   0.1396479 0.09855063 0.1587498   0.05371782
## 2    toy          35   0.1013709 0.07358528 0.1260297   0.11244352
##   Genie3.wrap mrnet.wrap mutrank.wrap mrnetb.wrap pcit.wrap zscore.wrap
## 1   0.1552356  0.1689863   0.08582062   0.1815839 0.1680160  0.03004052
## 2   0.1573182  0.1260915   0.08715210   0.1368010 0.1556404  0.02381227
##         rand
## 1 0.03077818
## 2 0.01840262
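
To compare methods at a glance, the table can be reduced to one mean score per method. The following is a minimal sketch in base R; it rebuilds a subset of the toy output printed above by hand (the object names `aupr.table`, `scores` and `mean.aupr` are illustrative and not part of the package), so it runs without executing the benchmark again.

```r
# Values copied from the toy output above (a subset of the methods).
aupr.table <- data.frame(
    Origin      = c("toy", "toy"),
    experiments = c(48, 35),
    aracne.wrap = c(0.1396479, 0.1013709),
    clr.wrap    = c(0.1587498, 0.1260297),
    mrnetb.wrap = c(0.1815839, 0.1368010),
    pcit.wrap   = c(0.1680160, 0.1556404),
    rand        = c(0.03077818, 0.01840262)
)

# Drop the two descriptive columns, keeping one column per method.
scores <- aupr.table[, -(1:2)]

# Mean score per method across the datasets, best first.
mean.aupr <- sort(colMeans(scores), decreasing = TRUE)
print(mean.aupr)
```

With only two toy datasets the ranking is not meaningful in itself; the same pattern applies unchanged to the first element of a real benchmark result.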

The package provides an easy way to compare new techniques with
state-of-the-art ones and to build new benchmarks in the future.

First, define the wrapper functions:

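    # Wrapper: Spearman rank correlation as edge weights
    Spearmancor <- function(data){
        cor(data,method="spearman")
    }

    # Wrapper: Pearson correlation as edge weights
    Pearsoncor <- function(data){
        cor(data,method="pearson")
    }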

Note that each wrapper function returns a matrix, the weighted adjacency matrix of the network inferred by the algorithm, and that its rows and columns are named.
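
Before passing a new wrapper to the benchmark, it can help to check this contract on a small input. The sketch below uses only base R; the random data frame `toy.data` and its layout (samples in rows, named genes in columns) are assumptions made for illustration, not data shipped with the package.

```r
# Wrapper under test (same definition as in the text).
Spearmancor <- function(data){
    cor(data, method = "spearman")
}

# Hypothetical input: 10 samples (rows) of 4 named genes (columns).
set.seed(1)
toy.data <- as.data.frame(matrix(rnorm(40), nrow = 10,
    dimnames = list(NULL, paste0("gene", 1:4))))

net <- Spearmancor(toy.data)

# The result should be a square numeric matrix whose row and column
# names match the variable names of the input.
stopifnot(is.matrix(net), is.numeric(net),
    nrow(net) == ncol(net),
    identical(rownames(net), colnames(net)),
    identical(colnames(net), colnames(toy.data)))
```

If `stopifnot()` returns silently, the wrapper output has the expected shape and names.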

Evaluate these two simple inference methods five times on the syntren300 datasource:

    res <- netbenchmark(datasources.names="syntren300",
        methods=c("Spearmancor","Pearsoncor"),verbose=FALSE)
    aupr <- res[[1]][,-(1:2)]

Make a boxplot of the \(AUPR_{20}\) results:

    boxplot(aupr, main="Syntren300",ylab=expression('AUPR'[20]))

Plot the mean Precision-Recall curves:

    PR <- res[[5]][[1]]
    col <- rainbow(3)
    plot(PR$rec[,1],PR$pre[,1],type="l",lwd=3,col=col[1],xlab="Recall",
        ylab="Precision",main="Syntren300",xlim=c(0,1),ylim=c(0,1))
    lines(PR$rec[,2],PR$pre[,2],type="l",lwd=3,col=col[2])
    lines(PR$rec[,3],PR$pre[,3],type="l",lwd=3,col=col[3])
    legend("topright", inset=.05,title="Method",colnames(PR$rec),fill=col)
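
For readers unfamiliar with how such curves arise, the following base R sketch computes precision and recall for a ranked list of edges. It is a generic illustration on a hypothetical 4-gene network (`true.net` and `weights` are made up here), not the evaluation code used inside the package.

```r
# Hypothetical 4-gene example: true network and inferred weights,
# both symmetric (undirected edges).
genes <- c("g1", "g2", "g3", "g4")
true.net <- matrix(0, 4, 4, dimnames = list(genes, genes))
true.net["g1", "g2"] <- true.net["g2", "g1"] <- 1
true.net["g3", "g4"] <- true.net["g4", "g3"] <- 1

weights <- matrix(0, 4, 4, dimnames = list(genes, genes))
weights["g1", "g2"] <- weights["g2", "g1"] <- 0.9
weights["g1", "g3"] <- weights["g3", "g1"] <- 0.7
weights["g3", "g4"] <- weights["g4", "g3"] <- 0.5
weights["g2", "g4"] <- weights["g4", "g2"] <- 0.1

# Consider each undirected edge once (upper triangle), ranked by weight.
ut <- upper.tri(weights)
ord <- order(weights[ut], decreasing = TRUE)
hits <- true.net[ut][ord]   # 1 if the k-th ranked edge is a true edge

# Precision and recall after accepting the top k edges, for every k.
tp <- cumsum(hits)
precision <- tp / seq_along(hits)
recall <- tp / sum(hits)
```

Plotting `precision` against `recall` gives one Precision-Recall curve; averaging such curves over datasets gives mean curves of the kind shown above.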

We can also compare these two simple inference methods with the fast network inference algorithms, again using the syntren300 datasource:

    comp <- netbenchmark(datasources.names="syntren300",
        methods=c("all.fast","Spearmancor","Pearsoncor"),verbose=FALSE)
## Estimate (local) false discovery rates (partial correlations):
## Estimate (local) false discovery rates (partial correlations):
## Estimate (local) false discovery rates (partial correlations):
## Estimate (local) false discovery rates (partial correlations):
## Estimate (local) false discovery rates (partial correlations):
    aupr <- comp[[1]][,-(1:2)]

Make a boxplot of the \(AUPR_{20}\) results:

    # make the method names look pretty by dropping the ".wrap" suffix
    library("tools")
    colnames(aupr) <- sapply(colnames(aupr),file_path_sans_ext)
    boxplot(aupr, main="Syntren300", ylab=expression('AUPR'[20]))
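
The renaming step relies on `file_path_sans_ext()` from the `tools` package, which treats the `.wrap` suffix as if it were a file extension. A short sketch of its effect on a few method names taken from the benchmark output (the vector names `method.names` and `clean.names` are illustrative):

```r
library(tools)

# Column names as they appear in the benchmark output.
method.names <- c("aracne.wrap", "GeneNet.wrap", "rand")

# file_path_sans_ext() drops a trailing ".<extension>" if present
# and leaves names without a dot unchanged.
clean.names <- file_path_sans_ext(method.names)
print(clean.names)
```

Because the function is vectorized, it can also be applied to the whole vector of column names directly, without `sapply()`.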