normalizeCounts {scater} | R Documentation |
Compute (log-)normalized expression values by dividing counts for each cell by the corresponding size factor.
normalizeCounts(x, ...)

## S4 method for signature 'ANY'
normalizeCounts(x, size_factors = NULL, use_size_factors = NULL,
  log = TRUE, return_log = NULL, pseudo_count = 1,
  log_exprs_offset = NULL, center_size_factors = TRUE,
  subset_row = NULL, downsample = FALSE, down_target = NULL,
  down_prop = 0.01)

## S4 method for signature 'SummarizedExperiment'
normalizeCounts(x, ..., exprs_values = "counts")

## S4 method for signature 'SingleCellExperiment'
normalizeCounts(x, size_factors = NULL, ...)
x |
A numeric matrix-like object containing counts for cells in the columns and features in the rows. Alternatively, a SingleCellExperiment or SummarizedExperiment object containing such a count matrix. |
... |
For the generic, arguments to pass to specific methods. For the SummarizedExperiment method, further arguments to pass to the ANY or DelayedMatrix methods. For the SingleCellExperiment method, further arguments to pass to the SummarizedExperiment method. |
size_factors |
A numeric vector of cell-specific size factors. Alternatively NULL, in which case the size factors are extracted or computed from x. |
use_size_factors |
Deprecated, same as size_factors. |
log |
Logical scalar indicating whether normalized values should be log2-transformed. |
return_log |
Deprecated, same as log. |
pseudo_count |
Numeric scalar specifying the pseudo-count to add when log-transforming expression values. |
log_exprs_offset |
Deprecated, same as pseudo_count. |
center_size_factors |
Logical scalar indicating whether size factors should be centered at unity before being used. |
subset_row |
A vector specifying the subset of rows of x for which to return normalized values. |
downsample |
Logical scalar indicating whether downsampling should be performed prior to scaling and log-transformation. |
down_target |
Numeric scalar specifying the downsampling target when downsample=TRUE. |
down_prop |
Numeric scalar between 0 and 1 indicating the quantile to use to define the downsampling target when downsample=TRUE and down_target=NULL. |
exprs_values |
A string or integer scalar specifying the assay of x containing the count matrix. |
Normalized expression values are computed by dividing the counts for each cell by the size factor for that cell.
This aims to remove cell-specific scaling biases, e.g., due to differences in sequencing coverage or capture efficiency.
If log=TRUE, log-normalized values are calculated by adding pseudo_count to the normalized count and performing a log2 transformation.
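
The calculation described above can be sketched in base R on a small matrix. This is an illustration only, not the scater implementation: the matrix counts and the vector sf are made-up values, and the size factors are assumed to be already centered at unity.

```r
# Toy count matrix: 3 features (rows) x 2 cells (columns). Hypothetical values.
counts <- matrix(c(10, 0, 5,
                   40, 4, 20),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("cellA", "cellB")))

# Hypothetical size factors, one per cell, with a mean of 1.
sf <- c(cellA = 0.5, cellB = 1.5)
pseudo_count <- 1

# Divide the counts in each column (cell) by that cell's size factor.
normalized <- sweep(counts, 2, sf, "/")

# Add the pseudo-count and take log2, mirroring what log=TRUE describes.
log_normalized <- log2(normalized + pseudo_count)
```

A zero count stays at zero after normalization, so with pseudo_count = 1 it maps to a log-value of exactly 0.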
If no size factors are supplied, they are determined automatically from x:
For count matrices and SummarizedExperiment inputs, the sum of counts for each cell is used to compute a size factor via the librarySizeFactors function.
For SingleCellExperiment instances, the function searches for sizeFactors from x. If none are available, it defaults to library size-derived size factors.
If size_factors are supplied, they will override any size factors present in x.
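
A minimal base-R sketch of library size-derived size factors follows. It assumes they are defined as each cell's total count divided by the mean total across cells; this is meant to mirror what is described for librarySizeFactors, not to reproduce that function's code. The matrix counts is a made-up example.

```r
# Toy count matrix: 3 features (rows) x 2 cells (columns). Hypothetical values.
counts <- matrix(c(10, 0, 5,
                   40, 4, 20),
                 nrow = 3)

# Library size = sum of counts for each cell (column).
lib_sizes <- colSums(counts)

# Assumed definition: scale library sizes so that their mean is 1.
sf <- lib_sizes / mean(lib_sizes)
```

Under this assumed definition, the ratio between two cells' size factors equals the ratio of their library sizes, and the size factors average to 1.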
If center_size_factors=TRUE, size factors are centered at unity prior to calculation of normalized expression values.
This means that the computed expression values can be interpreted as being on the same scale as log-counts, and that the value of pseudo_count can be interpreted as being on the same scale as the counts.
It also ensures that abundances are roughly comparable between features normalized with different sets of size factors.
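
The effect of centering can be illustrated with a small base-R sketch. It assumes that centering at unity means dividing the size factors by their mean; raw_sf and counts are hypothetical values, not output from any scater function.

```r
# Hypothetical uncentered size factors for two cells (e.g., raw library sizes).
raw_sf <- c(2000, 6000)

# Assumed centering: divide by the mean so that the size factors average to 1.
centered_sf <- raw_sf / mean(raw_sf)

# Counts for a single feature in the same two cells.
counts <- c(10, 30)

# Without centering, normalized values are tiny, so a pseudo-count of 1
# would dominate them after log-transformation.
uncentered_norm <- counts / raw_sf

# With centering, normalized values stay on the scale of the counts.
centered_norm <- counts / centered_sf
```

Here the centered size factors are 0.5 and 1.5, and both cells end up with a normalized value of 20, which is of the same order as the original counts.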
A matrix-like object of (log-)normalized expression values.
If downsample=TRUE, counts for each cell are randomly downsampled according to their size factors prior to log-transformation.
This is occasionally useful for avoiding artifacts caused by scaling count data with a strong mean-variance relationship.
Each cell is downsampled according to the ratio between down_target and that cell's size factor.
(Cells with size factors below the target are not downsampled and are directly scaled by this ratio.)
If log=TRUE, a log-transformation is also performed after adding pseudo_count to the downsampled counts.
Note that the normalized expression values in this mode cannot be interpreted as being on the same scale as the original counts; instead, their abundances are equivalent to those of counts after downsampling to the target size factor.
This motivates the use of a fixed down_target to ensure that expression values are comparable across different normalizeCounts calls.
We automatically set down_target to the 1st percentile of size factors across all cells involved in the analysis, but this is only appropriate if the resulting expression values are only compared within the same call to normalizeCounts.
If expression values are to be compared across multiple calls (e.g., in modelGeneVarWithSpikes or multiBatchNorm), down_target should be manually set to a constant target value that can be considered a low size factor in every call.
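
The downsampling mode described above can be sketched in base R. This is a rough illustration under stated assumptions, not the scater implementation: binomial thinning with rbinom stands in for whatever sampling scheme downsampleMatrix actually uses, and counts, sf and down_target are made-up values.

```r
set.seed(100)

# Toy count matrix: 5 features (rows) x 4 cells (columns). Hypothetical values.
counts <- matrix(rpois(5 * 4, lambda = 20), nrow = 5)

# Hypothetical size factors and a fixed, manually chosen target.
sf <- c(0.5, 1, 2, 4)
down_target <- 1

# Ratio between the target and each cell's size factor.
ratio <- down_target / sf

down <- counts
for (j in seq_len(ncol(counts))) {
  if (ratio[j] < 1) {
    # Size factor above the target: keep each count with probability `ratio`.
    # Binomial thinning is an assumption made for this sketch.
    down[, j] <- rbinom(nrow(counts), size = counts[, j], prob = ratio[j])
  } else {
    # Size factor at or below the target: no downsampling, scale directly.
    down[, j] <- counts[, j] * ratio[j]
  }
}

# Log-transform after adding a pseudo-count of 1, mirroring log=TRUE.
log_down <- log2(down + 1)
```

If down_target were left unset, the text above describes using the 1st percentile of the size factors, which in this sketch would correspond to something like quantile(sf, 0.01). Fixing down_target by hand, as done here, keeps values comparable across separate calls.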
Aaron Lun
logNormCounts, which wraps this function for convenient use with SingleCellExperiment instances.
downsampleMatrix, to perform the downsampling.
example_sce <- mockSCE()
normed <- normalizeCounts(example_sce)
str(normed)