scater-red-dim-args {scater}    R Documentation

Dimensionality reduction options

Description

An overview of the common options for dimensionality reduction methods in scater. The following sections consider an input x to the various run* methods, where x can be a numeric matrix or a SingleCellExperiment.

Feature selection

This section is relevant if x is a numeric matrix of (log-)expression values with features in rows and cells in columns, or if x is a SingleCellExperiment and dimred=NULL. In the latter case, the expression values are obtained from the assay specified by exprs_values.

The subset_row argument specifies the features to use in a dimensionality reduction algorithm. This can be set to any user-defined vector containing, e.g., highly variable features or genes in a pathway of interest. It can be a character vector of row names, an integer vector of row indices or a logical vector.
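As a small illustration of the three accepted forms, the sketch below builds a toy matrix in base R and selects the same two features by name, by index and by logical vector. The matrix and feature names are invented for the example; only the three vector types come from the text above.

```r
# Toy log-expression matrix: 6 features (rows) by 4 cells (columns).
set.seed(1)
mat <- matrix(rnorm(24), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("cell", 1:4)))

# Three equivalent descriptions of the same feature subset.
by_name    <- c("gene2", "gene5")           # character vector of row names
by_index   <- c(2L, 5L)                     # integer vector of row indices
by_logical <- rownames(mat) %in% by_name    # logical vector, one entry per row

sub_name    <- mat[by_name, , drop = FALSE]
sub_index   <- mat[by_index, , drop = FALSE]
sub_logical <- mat[by_logical, , drop = FALSE]
```

Any of these vectors could then be supplied as subset_row to one of the run* methods.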

If subset_row=NULL, the ntop features with the largest variances are used instead. The variances are computed directly from the expression values, without modelling any mean-variance trend. Note that the value of ntop is ignored if subset_row is specified.
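The default selection can be approximated in base R as follows. This is a minimal sketch of the behaviour described above, not scater's own implementation; the simulated matrix and the choice of ntop = 50 are assumptions made for the example.

```r
# Simulated log-expression matrix: 200 features by 10 cells.
set.seed(42)
mat <- matrix(rnorm(200 * 10), nrow = 200,
              dimnames = list(paste0("gene", 1:200), NULL))
ntop <- 50  # illustrative value

# Per-feature variance across cells; no mean-variance trend is fitted.
feature_var <- apply(mat, 1, var)

# Row indices of the ntop most variable features.
keep <- order(feature_var, decreasing = TRUE)[seq_len(min(ntop, nrow(mat)))]
top_mat <- mat[keep, , drop = FALSE]
```

Because the ranking uses raw variances, highly expressed features with large technical variance may dominate the selection. A user-defined subset_row is the way to avoid that.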

If scale=TRUE, the expression values for each feature are standardized so that their variance is unity. This will also remove features with standard deviations below 1e-8.
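The effect of scale=TRUE can be sketched in base R as below. This only illustrates the two steps named above (dropping near-constant features, then dividing by the standard deviation). The toy data are invented, and the code is not taken from scater.

```r
# Toy matrix: 5 features by 8 cells, with one constant feature.
set.seed(7)
mat <- matrix(rnorm(5 * 8), nrow = 5,
              dimnames = list(paste0("gene", 1:5), NULL))
mat[3, ] <- 1  # gene3 has zero variance

# Standard deviation of each feature across cells.
feature_sd <- apply(mat, 1, sd)

# Drop features whose standard deviation is below the 1e-8 threshold.
keep <- feature_sd >= 1e-8

# Divide each remaining row by its standard deviation (unit variance).
scaled <- mat[keep, , drop = FALSE] / feature_sd[keep]
```

Removing the near-constant features first avoids division by (almost) zero, which would otherwise produce infinite or undefined values.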

Using reduced dimensions

This section is relevant if x is a SingleCellExperiment and dimred is not NULL.

All dimensionality reduction methods can be applied to existing dimensionality reduction results in x by setting dimred. This is typically used to run non-linear algorithms like t-SNE or UMAP on the PCA results. It may also be desirable in cases where the existing reduced dimensions are computed from a priori knowledge (e.g., gene set scores). In such cases, further reduction with PCA could be used to compress the data.

The matrix of existing reduced dimensions is taken from reducedDim(x, dimred). By default, all dimensions are used to compute the second set of reduced dimensions. If n_dimred is a single integer, only the first n_dimred columns are used. Alternatively, n_dimred can be an integer vector specifying the column indices of the dimensions to use.
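The two interpretations of n_dimred can be sketched with a small base R helper. The function select_dims is hypothetical and exists only for this example. It assumes that a vector of length one is treated as a count rather than as a single column index.

```r
# Stand-in for an existing reduced dimension matrix: 20 cells by 6 dimensions.
set.seed(3)
pcs <- matrix(rnorm(20 * 6), nrow = 20,
              dimnames = list(NULL, paste0("PC", 1:6)))

# Hypothetical helper mimicking the documented handling of n_dimred.
select_dims <- function(reddim, n_dimred = NULL) {
  if (is.null(n_dimred)) {
    return(reddim)                      # default: use every dimension
  }
  if (length(n_dimred) == 1L) {
    n_dimred <- seq_len(n_dimred)       # scalar: the first n_dimred columns
  }
  reddim[, n_dimred, drop = FALSE]      # vector: explicit column indices
}

all_dims    <- select_dims(pcs)
first_three <- select_dims(pcs, 3)
chosen      <- select_dims(pcs, c(2L, 4L))
```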

When dimred is specified, no additional feature selection or standardization is performed. This means that any settings of ntop, subset_row and scale are ignored.

Transposed inputs

This section is relevant if x is a numeric matrix and transposed=TRUE, such that cells are the rows and the various dimensions are the columns.

Here, the aim is to allow users to manually pass in dimensionality reduction results without needing to wrap them in a SingleCellExperiment. As such, no feature selection or standardization is performed, i.e., ntop, subset_row and scale are ignored.
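To make the expected orientation concrete, the sketch below builds a cells-by-dimensions matrix and reduces it further with base R's prcomp, which stands in here for a run* method and is not necessarily what scater uses internally. The data are simulated. The matrix is used exactly as supplied, with no row selection or scaling.

```r
# Simulated prior results: 30 cells (rows) by 4 dimensions (columns).
set.seed(11)
cells_by_dims <- matrix(rnorm(30 * 4), nrow = 30,
                        dimnames = list(paste0("cell", 1:30),
                                        paste0("dim", 1:4)))

# Cells are already in rows, so the matrix is passed directly.
pca <- prcomp(cells_by_dims, rank. = 2)

# One row of coordinates per cell, one column per retained component.
coords <- pca$x
```

With an untransposed expression matrix, the features-by-cells layout would instead need to be flipped with t() before a call like this.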

Alternative experiments

This section is relevant if x is a SingleCellExperiment and altexp is not NULL.

If altexp is specified, the method is run on data from an alternative SummarizedExperiment nested within x. This is useful for performing dimensionality reduction on other features stored in altExp(x, altexp), e.g., antibody tags.

Setting altexp with exprs_values will use the specified assay from the alternative SummarizedExperiment. If the alternative is a SingleCellExperiment, setting dimred will use the specified dimensionality reduction results from the alternative. In that case, n_dimred selects dimensions from those results in the same way as for the main experiment.

Note that the output is still stored in the reducedDims of the output SingleCellExperiment. It is advisable to use a different name to distinguish these results from those obtained from the main experiment's assay values.

Author(s)

Aaron Lun

See Also

These arguments are used throughout runPCA, runTSNE, runUMAP, runMDS and runDiffusionMap.


[Package scater version 1.14.0 Index]