clusterSNNGraph {scran}R Documentation

Wrappers for graph-based clustering

Description

Perform graph-based clustering using community detection methods on a nearest-neighbor graph, where nodes represent cells or k-means centroids.

Usage

clusterSNNGraph(x, ...)

## S4 method for signature 'ANY'
clusterSNNGraph(
  x,
  ...,
  clusterFUN = cluster_walktrap,
  subset.row = NULL,
  transposed = FALSE,
  use.kmeans = FALSE,
  kmeans.centers = NULL,
  kmeans.args = list(),
  full.stats = FALSE
)

## S4 method for signature 'SummarizedExperiment'
clusterSNNGraph(x, ..., assay.type = "logcounts")

## S4 method for signature 'SingleCellExperiment'
clusterSNNGraph(x, ..., use.dimred = NULL)

clusterKNNGraph(x, ...)

## S4 method for signature 'ANY'
clusterKNNGraph(
  x,
  ...,
  clusterFUN = cluster_walktrap,
  subset.row = NULL,
  transposed = FALSE,
  use.kmeans = FALSE,
  kmeans.centers = NULL,
  kmeans.args = list(),
  full.stats = FALSE
)

## S4 method for signature 'SummarizedExperiment'
clusterKNNGraph(x, ..., assay.type = "logcounts")

## S4 method for signature 'SingleCellExperiment'
clusterKNNGraph(x, ..., use.dimred = NULL)

Arguments

x

A matrix-like object containing expression values for each gene (row) in each cell (column). These dimensions can be transposed if transposed=TRUE.

Alternatively, a SummarizedExperiment or SingleCellExperiment containing such an expression matrix.

If x is a SingleCellExperiment and use.dimred is set, graph building will be performed from its reducedDims.

...

For the generics, additional arguments to pass to the specific methods.

For the ANY methods, additional arguments to pass to buildSNNGraph or buildKNNGraph.

For the SummarizedExperiment method, additional arguments to pass to the corresponding ANY method.

For the SingleCellExperiment method, additional arguments to pass to the corresponding SummarizedExperiment method.

clusterFUN

Function that can take a graph object and return a communities object, see examples in the igraph package.

subset.row

See ?"scran-gene-selection". Only used when transposed=FALSE.

transposed

A logical scalar indicating whether x is transposed (i.e., rows are cells).

use.kmeans

Logical scalar indicating whether k-means clustering should be performed.

kmeans.centers

Integer scalar specifying the number of clusters to use for k-means clustering. Defaults to the square root of the number of cells in x.

kmeans.args

List containing additional named arguments to pass to kmeans.

full.stats

Logical scalar indicating whether to return more statistics regarding the k-means clustering.

assay.type

A string specifying which assay values to use.

use.dimred

A string specifying whether existing values in reducedDims(x) should be used.

Details

By default, these functions simply call buildSNNGraph or buildKNNGraph followed by the specified clusterFUN to generate a clustering. We use the Walktrap algorithm by default as it has a cool-sounding name, but users can feel free to swap it for some other algorithm (e.g., cluster_louvain).

For large datasets, we can perform vector quantization with k-means to create centroids that are then subjected to graph-based clustering. The label for each cell is then defined as the label of the centroid to which it was assigned. In this approach, k-means and graph-based clustering complement each other; the former improves computational efficiency and mitigates density-dependent dilation, while the latter aggregates the centroids for easier interpretation.

Value

If full.stats=FALSE, a factor is returned containing the cluster assignment for each cell.

If full.stats=TRUE and use.kmeans=TRUE, a DataFrame is returned with one row per cell. This contains the columns kmeans, specifying the assignment of each cell to a k-means cluster; and igraph, specifying the assignment of each cell to a graph-based cluster operating on the k-means clusters. In addition, the metadata contains graph, a graph object where each node is a k-means cluster; and membership, the graph-based cluster to which each node is assigned.

Author(s)

Aaron Lun

See Also

buildSNNGraph and buildKNNGraph, to build the nearest-neighbor graphs.

cluster_walktrap and related functions, to detect communities within those graphs.

Examples

library(scater)
sce <- mockSCE(ncells=500)
sce <- logNormCounts(sce)

clusters <- clusterSNNGraph(sce)
table(clusters)

# Can pass usual arguments to buildSNNGraph:
clusters2 <- clusterSNNGraph(sce, k=5)
table(clusters2)

# Works with low-dimensional inputs:
sce <- runPCA(sce, ncomponents=10)
clusters3 <- clusterSNNGraph(sce, use.dimred="PCA")
table(clusters3)

# Turn on k-means for larger datasets, e.g., 
# assuming we already have a PCA result:
set.seed(101010)
bigpc <- matrix(rnorm(2000000), ncol=20)
clusters4 <- clusterSNNGraph(bigpc, d=NA, use.kmeans=TRUE, transposed=TRUE)
table(clusters4)

# Extract the graph for more details:
clusters5 <- clusterSNNGraph(sce, use.dimred="PCA", 
    use.kmeans=TRUE, full.stats=TRUE)
head(clusters5)
metadata(clusters5)$graph


[Package scran version 1.16.0 Index]