scater-vis-var {scater} | R Documentation |
A number of scater functions accept a SingleCellExperiment object and extract (meta)data from it for use in a plot.
These values are then used on the x- or y-axes (e.g., plotColData) or for tuning visual parameters, e.g., colour_by, shape_by, size_by.
This page describes how the user can control the selection of these values by passing appropriate values to the arguments of the desired plotting function.
When plotting by cells, we assume that each visual feature of interest (e.g., point or line) corresponds to a cell in the SingleCellExperiment object sce.
We will also assume that the user wants to change the colour of each feature according to the cell (meta)data.
To do so, the user can pass one of the following as an argument:
- An unnamed character vector of length 1, i.e., a string.
  This is initially assumed to be the name of a column-level metadata field.
  The function will first search the column names of colData(sce), and extract metadata for all cells if a matching field is found.
  If no match is found, the function will assume that the string represents a gene name.
  It will search rownames(sce) and extract gene expression values for the matching row across all cells.
  If neither search finds a match, an error is raised.
- A named character vector of length 1, where the name is either "exprs" or "metadata".
  This forces the function to only search for the string in rownames(sce) or colnames(colData(sce)), respectively.
  Adding an explicit name is useful when the same field exists in both the row names and column metadata names.
- A character vector of length greater than 1.
  This will search for nested fields in colData(sce).
  For example, supplying a character vector c("A", "B", "C") will retrieve colData(sce)$A$B$C, where both A and B contain nested DataFrames.
  See calculateQCMetrics with compact=TRUE for an example of how these can be constructed.
  The concatenated name "A:B:C" will be used in the legend.
- A character vector of length greater than 1 with the first element set to NA.
  This will search for nested fields in the internal column data of a SingleCellExperiment, i.e., in int_colData.
  For example, c(NA, "size_factor") would retrieve the values corresponding to sizeFactors(sce).
  The concatenated name without the NA is used in the legend.
  Note that internal fields are only searched when NA is the first element.
- A data frame with one column and number of rows equal to the number of cells.
  This should contain values to use for visualization, e.g., for plotting on the x-/y-axis, or for colouring by.
  In this manner, the user can use new information without manually adding it to the SingleCellExperiment object.
  The column name of the data frame will be used in the legend.
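The lookup order described above can be sketched in base R. This is only an illustration under stated assumptions, not scater's implementation: resolve_by is a hypothetical helper, coldata stands in for colData(sce), exprs stands in for an assay matrix, and the NA-prefixed lookup in int_colData is omitted.

```r
# Hypothetical helper that mimics the documented lookup order.
# `coldata` stands in for colData(sce); `exprs` is a genes-by-cells matrix.
resolve_by <- function(by, coldata, exprs) {
    if (is.data.frame(by)) {
        # One-column data frame: use its values and its column name.
        stopifnot(ncol(by) == 1L, nrow(by) == ncol(exprs))
        return(list(name = colnames(by), values = by[[1]]))
    }
    stopifnot(is.character(by))
    if (length(by) > 1L) {
        # Nested lookup, e.g. c("A", "B", "C") -> coldata$A$B$C.
        values <- coldata
        for (field in by) values <- values[[field]]
        return(list(name = paste(by, collapse = ":"), values = values))
    }
    # `where` is NULL for an unnamed string, else "exprs" or "metadata".
    where <- names(by)
    by <- unname(by)
    if (!identical(where, "exprs") && by %in% colnames(coldata)) {
        return(list(name = by, values = coldata[[by]]))
    }
    if (!identical(where, "metadata") && by %in% rownames(exprs)) {
        return(list(name = by, values = exprs[by, ]))
    }
    stop("cannot find '", by, "'")
}

# Toy data in which "Gene1" is both a metadata field and a gene name.
coldata <- data.frame(batch = c("a", "b", "a"), Gene1 = c(10, 20, 30),
    stringsAsFactors = FALSE)
exprs <- matrix(1:6, nrow = 2,
    dimnames = list(c("Gene1", "Gene2"), c("cell1", "cell2", "cell3")))

resolve_by("batch", coldata, exprs)$values             # metadata column
resolve_by("Gene1", coldata, exprs)$values             # metadata is searched first
resolve_by(c(exprs = "Gene1"), coldata, exprs)$values  # forced to the expression row
```

In the toy data the unnamed string "Gene1" resolves to the metadata column, which is the ambiguity that the "exprs" and "metadata" names are meant to settle.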
The same logic applies for other visualization parameters such as shape_by and size_by.
Other arguments may also use the same scheme, but this depends on the context; see the documentation for each function for details.
In particular, if an argument explicitly refers to a metadata field, any names for the character string will be ignored.
Similarly, a character vector of length greater than 1 is not allowed for an argument that explicitly refers to expression values.
When plotting by features, we assume that each visual feature of interest (e.g., point or line) corresponds to a feature in the SingleCellExperiment object sce.
The scheme is mostly the same as described above, with a few differences:
- rowData is searched instead of colData, as we are extracting metadata for each feature.
- When extracting expression values, the name of a single cell must be specified.
  Visualization will then use the expression profile for all features in that cell.
  (This tends to be a rather unusual choice for colouring.)
- A character string named "exprs" is searched for in colnames(sce).
- A data frame input should have number of rows equal to the number of features.
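A minimal base R sketch of the feature-level differences listed above, assuming a plain data frame (rowdata) in place of rowData(sce) and a plain matrix (exprs) in place of an assay. The helper resolve_by_feature is hypothetical and only handles an unnamed string.

```r
# `rowdata` stands in for rowData(sce); `exprs` is a genes-by-cells matrix.
rowdata <- data.frame(is_mito = c(TRUE, FALSE, FALSE),
    row.names = c("Gene1", "Gene2", "Gene3"))
exprs <- matrix(1:6, nrow = 3,
    dimnames = list(rownames(rowdata), c("cell1", "cell2")))

# Hypothetical helper: row metadata fields first, then cell names.
resolve_by_feature <- function(by, rowdata, exprs) {
    if (by %in% colnames(rowdata)) return(rowdata[[by]])
    if (by %in% colnames(exprs)) return(exprs[, by])
    stop("cannot find '", by, "'")
}

resolve_by_feature("is_mito", rowdata, exprs)  # one value per feature
resolve_by_feature("cell2", rowdata, exprs)    # expression profile of cell2
```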
Most functions will have a by_exprs_values parameter.
This defines the assay of the SingleCellExperiment object from which expression values are extracted for use in colouring, shaping or sizing the points.
The setting of by_exprs_values will usually default to "logcounts", or to the value of exprs_values in functions such as plotExpression.
However, it can be specified separately from exprs_values, which is useful for visualizing two different types of expression values on the same plot.
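To illustrate why the two settings are kept separate, the sketch below draws the plotted values and the colouring values from different assays. It assumes a plain named list in place of assays(sce), and the log-transform is a simplified stand-in for however logcounts were actually computed.

```r
counts <- matrix(c(0, 3, 7, 1, 0, 15), nrow = 2,
    dimnames = list(c("Gene1", "Gene2"), c("cell1", "cell2", "cell3")))
# A plain list stands in for assays(sce); no normalization is applied here.
assays <- list(counts = counts, logcounts = log2(counts + 1))

exprs_values <- "logcounts"   # assay used for the plotted values
by_exprs_values <- "counts"   # assay used for colouring

y_values <- assays[[exprs_values]]["Gene1", ]
colour_values <- assays[[by_exprs_values]]["Gene2", ]
```

Here Gene1 would be plotted on the log scale while the points are coloured by the raw counts of Gene2.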
Most functions will also have a by_show_single parameter.
If FALSE, variables with only one level are not used for visualization, i.e., the visual aspect (colour or shape or size) is set to the default for all points.
No guide is created for this aspect, avoiding clutter in the legend when that aspect provides no information.
If TRUE, all supplied variables are used for visualization, regardless of how many levels they have.
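The rule can be sketched as a small predicate. The helper use_for_visual is hypothetical and counts distinct values with unique(), which is an assumption about what "level" means here rather than a description of scater's internals.

```r
# Hypothetical helper: should this variable drive a visual aspect?
use_for_visual <- function(values, by_show_single = FALSE) {
    by_show_single || length(unique(values)) > 1L
}

batch <- rep("A", 5)                          # a single level
use_for_visual(batch)                         # FALSE: default aspect, no guide
use_for_visual(batch, by_show_single = TRUE)  # TRUE: used regardless
use_for_visual(c("A", "B", "A"))              # TRUE: more than one level
```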
See also: plotColData, plotRowData, plotReducedDim, plotExpression, plotPlatePosition, and most other plotting functions.