indexCell {scmap} | R Documentation
The method is based on product quantization for the cosine distance. The training data is split by genes into M equally sized chunks (groups). k-means is then run on each chunk to find k subcentroids, and every cell in the dataset is assigned the number of its subcluster in each chunk.
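The steps described above can be sketched in base R. This is a minimal illustration on random data, not scmap's own implementation: the helper name `pq_index_sketch`, the toy matrix, and the unit-length scaling of each chunk (used here so that Euclidean k-means roughly stands in for the cosine distance) are all assumptions made for the example.

```r
# Hypothetical sketch of product quantization; not scmap's own code.
pq_index_sketch <- function(expr, M, k) {
  # expr: genes x cells matrix; the number of genes must be divisible by M
  stopifnot(nrow(expr) %% M == 0)
  chunk_size <- nrow(expr) / M
  # assign each gene (row) to one of M consecutive, equally sized chunks
  chunk_id <- rep(seq_len(M), each = chunk_size)

  subcentroids <- vector("list", M)
  subclusters <- matrix(NA_integer_, nrow = M, ncol = ncol(expr))

  for (m in seq_len(M)) {
    chunk <- expr[chunk_id == m, , drop = FALSE]
    # scale each cell's sub-vector to unit length (assumption: this lets
    # Euclidean k-means approximate clustering by cosine distance)
    norms <- sqrt(colSums(chunk^2))
    norms[norms == 0] <- 1
    chunk <- sweep(chunk, 2, norms, "/")
    # kmeans() clusters rows, so cells are placed in rows
    km <- kmeans(t(chunk), centers = k, nstart = 5)
    subcentroids[[m]] <- t(km$centers)  # chunk_size x k matrix
    subclusters[m, ] <- km$cluster      # subcluster number for each cell
  }
  list(subcentroids = subcentroids, subclusters = subclusters, M = M, k = k)
}

set.seed(1)
toy <- matrix(rexp(40 * 30), nrow = 40, ncol = 30)  # 40 genes, 30 cells
idx <- pq_index_sketch(toy, M = 4, k = 3)
dim(idx$subclusters)
```

The returned list mirrors the four components named in the value description of this page (subcentroids, subclusters, M and k), which is a choice made for readability rather than a statement about how scmap stores its index internally.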
indexCell(object = NULL, M = NULL, k = NULL)

indexCell.SingleCellExperiment(object, M, k)

## S4 method for signature 'SingleCellExperiment'
indexCell(object = NULL, M = NULL, k = NULL)
object | an object of SingleCellExperiment class
M | number of chunks into which the expression matrix is split
k | number of clusters per group for k-means clustering
a list of four objects:
1) a list of matrices containing the subcentroids of each group
2) a matrix containing the subclusters for each cell for each group
3) the value of M
4) the value of k
library(SingleCellExperiment)
# scmap provides the yan and ann example data, selectFeatures() and indexCell()
library(scmap)

sce <- SingleCellExperiment(assays = list(normcounts = as.matrix(yan)), colData = ann)
# this is needed to calculate dropout rate for feature selection
# important: normcounts have the same zeros as raw counts (fpkm)
counts(sce) <- normcounts(sce)
logcounts(sce) <- log2(normcounts(sce) + 1)
# use gene names as feature symbols
rowData(sce)$feature_symbol <- rownames(sce)
isSpike(sce, 'ERCC') <- grepl('^ERCC-', rownames(sce))
# remove features with duplicated names
sce <- sce[!duplicated(rownames(sce)), ]
sce <- selectFeatures(sce)
sce <- indexCell(sce)