There are various ways to detect modules in weighted networks. Conventional approaches such as clustering or graph partitioning rely purely on network topology to define modules (Fortunato 2010). Topology alone may not be enough, however, and additional information such as the growing body of high-throughput omics data can help. On the one hand, the construction of reliable networks, especially tissue-specific ones, is relatively slow. On the other hand, integrating omics data with network topology has become a paradigm in the systems biology community over the past decade (Mitra et al. 2013). A weighted gene co-expression network (WGCN) is a purely data-driven gene network that relies only on gene expression profiles. There is no rigorous definition of an active module in a WGCN, but a module should be more compact and informative than a random subnetwork. AMOUNTAIN (Li et al. 2016) provides a convex-optimization-based approach to identifying such modules. Here we gather parts of the examples from the help pages of the AMOUNTAIN package into a single document.
We follow Li et al. (2011) in constructing gene co-expression networks for the simulation study. Let \(n\) be the number of genes; the edge weights \(W\) and the node scores \(z\) follow the uniform distribution on \([0,1]\). A module contains \(k\) genes, within which the edge weights and node scores follow the uniform distribution on \([\theta,1]\), where \(\theta\in\{0.5,0.6,0.7,0.8,0.9\}\).
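library(AMOUNTAIN)
n <- 100       # number of genes
k <- 20        # module size
theta <- 0.5   # lower bound for weights and scores inside the module
pp <- networkSimulation(n, k, theta)
moduleid <- pp[[3]]          # indices of the module members
netid <- seq_len(n)          # all node indices (was hard-coded as 1:100)
restp <- netid[-moduleid]    # remaining nodes form the background
# Group nodes for plotting: module members vs. background
groupdesign <- list(moduleid, restp)
names(groupdesign) <- c('module', 'background')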
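To make the generative model described above explicit, here is a minimal base-R sketch of such a simulation. It is not the implementation of `networkSimulation` in the package; the function name `simulate_network_sketch`, the random choice of module members, and the symmetric weight matrix with a zero diagonal are assumptions made only for illustration.

```r
# Hypothetical helper, NOT the AMOUNTAIN implementation: illustration only
simulate_network_sketch <- function(n, k, theta) {
  stopifnot(k <= n, theta >= 0, theta <= 1)
  # Background: edge weights and node scores drawn from U(0, 1)
  W <- matrix(0, n, n)
  W[upper.tri(W)] <- runif(n * (n - 1) / 2)
  z <- runif(n)
  # Module: k randomly chosen genes (assumption), sorted so that the
  # module's upper triangle maps into the upper triangle of W
  moduleid <- sort(sample.int(n, k))
  Wm <- matrix(0, k, k)
  Wm[upper.tri(Wm)] <- runif(k * (k - 1) / 2, min = theta, max = 1)
  W[moduleid, moduleid] <- Wm
  z[moduleid] <- runif(k, min = theta, max = 1)
  # Symmetrise (assumption: undirected network without self-loops)
  W <- W + t(W)
  list(W = W, z = z, moduleid = moduleid)
}

set.seed(1)
sim <- simulate_network_sketch(n = 100, k = 20, theta = 0.5)
```

Under these assumptions, every edge weight and node score inside the module is at least \(\theta\), while background values range over all of \([0,1]\), so larger \(\theta\) gives a module that stands out more clearly.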
The following figure shows the weighted co-expression network for \(n=100\) and \(k=20\). Red nodes indicate module members, and wider edges indicate larger similarities. The visualization is produced with qgraph.
require(qgraph)
## Loading required package: qgraph
pg <- qgraph(pp[[1]],groups=groupdesign,legend=TRUE)
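As a numerical complement to the figure, the compactness of a node set can be summarised by its mean pairwise edge weight. The sketch below uses a hypothetical helper `mean_edge_weight` and a made-up four-node matrix `W_toy`; neither is part of AMOUNTAIN, and the helper assumes a symmetric weight matrix.

```r
# Hypothetical helper: mean weight over the distinct node pairs among `ids`,
# assuming W is a symmetric weight matrix
mean_edge_weight <- function(W, ids) {
  sub <- W[ids, ids, drop = FALSE]
  mean(sub[upper.tri(sub)])
}

# Toy 4-node network (made-up numbers): nodes 1 and 2 share a heavy edge
W_toy <- matrix(0.1, 4, 4)
W_toy[1, 2] <- W_toy[2, 1] <- 0.9
diag(W_toy) <- 0

dense_pair <- mean_edge_weight(W_toy, c(1, 2))   # the single edge 1-2
sparse_pair <- mean_edge_weight(W_toy, c(3, 4))  # the single edge 3-4
```

Assuming `pp[[1]]` is the symmetric weight matrix that was passed to `qgraph`, the same helper could be applied as `mean_edge_weight(pp[[1]], moduleid)` and `mean_edge_weight(pp[[1]], restp)` to compare the module with the background.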