This R package supports the handling and analysis of imaging mass cytometry and other highly multiplexed imaging data. The main functionality includes reading in single-cell data after image segmentation and measurement, data formatting to perform channel spillover correction and a number of spatial analysis approaches. First, cell-cell interactions are detected via spatial graph construction; these graphs can be visualized with cells representing nodes and interactions representing edges. Furthermore, per cell, its direct neighbours are summarized to allow spatial clustering. Per image/grouping level, interactions between types of cells are counted, averaged and compared against random permutations. In that way, types of cells that interact more (attraction) or less (avoidance) frequently than expected by chance are detected.
imcRtools 1.14.0
This vignette gives an introduction to handling and analyzing imaging
mass cytometry (IMC) and other highly-multiplexed imaging data in R. The
imcRtools
package relies on expression and morphological features
extracted from multi-channel images using corresponding segmentation
masks. A description of data types and segmentation approaches can be
found below (data types, segmentation).
However, due to shared data structures, the functionalities of the
imcRtools
package are applicable to most highly multiplexed imaging
modalities.
The imcRtools
package exports functions and example data to perform
the following analyses:
To highlight the usability of these function, the imcRtools
package
also exports a number of test data files.
Highly multiplexed imaging techniques (Bodenmiller 2016) such as imaging
mass cytometry (IMC) (Giesen et al. 2014), multiplexed ion beam imaging (MIBI)
(Angelo et al. 2014) and cyclic immunofluorescence techniques (Lin et al. 2018, @Gut2018) acquire read-outs of the expression of tens of protein in a
spatially resolved manner. After technology-dependent data
pre-processing, the raw data files are comparable: multi-channel images
for the dimensions x
, y
, and c
, where x
and y
define the
number of pixels (x * y
) per image and c
the number of proteins
(also refered to as “markers”) measured per image. More info on the
data types and further pre-processing can
be found below.
Increased multiplexity compared to epitope-based techniques is achieved using single-cell resolved spatial transcriptomics techniques including MERFISH (Chen et al. 2015) and seqFISH (Lubeck et al. 2014). However, data produced by these techniques required different pre-processing steps. Nevertheless, analysis performed on the single-cell level is equally applicable.
IMC (Giesen et al. 2014) is a highly multiplexed imaging approach to measure spatial protein abundance. In IMC, tissue sections on slides are stained with a mix of around 40 metal-conjugated antibodies prior to laser ablation with \(1\mu{}m\) resolution. The ablated material is transferred to a mass cytometer for time-of-flight detection of the metal ions (Giesen et al. 2014). At an ablation frequency of 200Hz, a 1mm x 1mm region can be acquired within 1.5 hours.
In the case of IMC, the raw data output are .mcd files containing all acquired regions per slide. During image pre-prcoessing these files are converted into individual multi-channel .tiff and OME-TIFF files. These file formats are supported by most open-source and commercial image analysis software, such as Fiji, QuPath or napari.
In R, these .tiff files can be read into individual Image objects or combined into a CytoImageList object supported by the cytomapper package.
The pixel resolution of most highly multiplexed imaging technologies including IMC support the detection of cellular structures. Therefore, a common processing step using multi-channel images is object segmentation. In this vignette, objects are defined as cells; however, also larger scale structures could be segmented.
There are multiple different image segmentation approaches available.
However, imcRtools
only supports direct reader functions for two
segmentation strategies developed for highly multiplexed imaging
technologies:
The ImcSegmentationPipeline has been developed to give the user flexibility in how to perform channel selection and image segmentation. The underlying principle is to train a pixel classifier (using ilastik) on a number of selected channels to perform probabilistic classification of each pixel into: background, cytoplasm and nucleus. Based on these classification probabilities a CellProfiler pipeline performs cell segmentation and feature extraction.
A containerized version of this pipeline is implemented in
steinbock.
steinbock
further supports segmentation via the use of Mesmer
from the DeepCell
library (Greenwald et al. 2021).
While the output of both approaches is structured differently, the exported features are comparable:
By combining these with availabel channel information, the data can be read into a SpatialExperiment or SingleCellExperiment object (see below).
The imcRtools
package contains a number of example data generated by
the Hyperion imaging system for different purposes. The following
section gives an overview of these files.
To highlight the use of the imcRtools
package for spillover
correction, we provide four .txt files containing pixel intensities of
four spotted metals.
Please refer to the spillover correction section for full details.
These files are accessible via:
path <- system.file("extdata/spillover", package = "imcRtools")
list.files(path, recursive = TRUE)
## [1] "Dy161.txt" "Dy162.txt" "Dy163.txt" "Dy164.txt"
IMC generates .mcd files storing the raw data or all acquired regions of interest (ROI). In addition, the raw pixel values are also stored in individual .txt files per ROI.
To highlight reading in raw data in form of .txt files, the imcRtools
contains 3 sample acquisitions:
txt_files <- list.files(system.file("extdata/mockData/raw",
package = "imcRtools"))
txt_files
## [1] "20210305_NE_mockData2_ROI_001_1.txt" "20210305_NE_mockData2_ROI_002_2.txt"
## [3] "20210305_NE_mockData2_ROI_003_3.txt"
IMC data preprocessing and segmentation can be performed using the
ImcSegmentationPipeline.
It generates a number of .csv
files containing object/cell-specific
and image-specific metadata.
The imcRtools
package exports the read_cpout
function as convenient
reader function for outputs generated by the ImcSegmentationPipeline
.
For demonstration purposes, imcRtools
contains the output of running
the pipeline on a small example dataset:
path <- system.file("extdata/mockData/cpout", package = "imcRtools")
list.files(path, recursive = TRUE)
## [1] "Experiment.csv" "Image.csv"
## [3] "Object_relationships.csv" "cell.csv"
## [5] "panel.csv" "var_Image.csv"
## [7] "var_cell.csv"
The steinbock pipeline can be used to process, segment and extract features from IMC data. For more information, please refer to its documentation.
To highlight the functionality of imcRtools
to read in single-cell
data generated by steinbock
, we provide a small toy dataset available
at:
path <- system.file("extdata/mockData/steinbock", package = "imcRtools")
list.files(path, recursive = TRUE)
## [1] "images.csv"
## [2] "intensities/20210305_NE_mockData1_001.csv"
## [3] "intensities/20210305_NE_mockData1_002.csv"
## [4] "intensities/20210305_NE_mockData1_003.csv"
## [5] "intensities/20210305_NE_mockData2_001.csv"
## [6] "intensities/20210305_NE_mockData2_002.csv"
## [7] "intensities/20210305_NE_mockData2_003.csv"
## [8] "intensities/20210305_NE_mockData3_001.csv"
## [9] "intensities/20210305_NE_mockData3_002.csv"
## [10] "intensities/20210305_NE_mockData3_003.csv"
## [11] "intensities/20210305_NE_mockData4_001.csv"
## [12] "intensities/20210305_NE_mockData4_002.csv"
## [13] "intensities/20210305_NE_mockData4_003.csv"
## [14] "intensities/20210305_NE_mockData5_001.csv"
## [15] "intensities/20210305_NE_mockData5_002.csv"
## [16] "intensities/20210305_NE_mockData5_003.csv"
## [17] "neighbors/20210305_NE_mockData1_001.csv"
## [18] "neighbors/20210305_NE_mockData1_002.csv"
## [19] "neighbors/20210305_NE_mockData1_003.csv"
## [20] "neighbors/20210305_NE_mockData2_001.csv"
## [21] "neighbors/20210305_NE_mockData2_002.csv"
## [22] "neighbors/20210305_NE_mockData2_003.csv"
## [23] "neighbors/20210305_NE_mockData3_001.csv"
## [24] "neighbors/20210305_NE_mockData3_002.csv"
## [25] "neighbors/20210305_NE_mockData3_003.csv"
## [26] "neighbors/20210305_NE_mockData4_001.csv"
## [27] "neighbors/20210305_NE_mockData4_002.csv"
## [28] "neighbors/20210305_NE_mockData4_003.csv"
## [29] "neighbors/20210305_NE_mockData5_001.csv"
## [30] "neighbors/20210305_NE_mockData5_002.csv"
## [31] "neighbors/20210305_NE_mockData5_003.csv"
## [32] "panel.csv"
## [33] "regionprops/20210305_NE_mockData1_001.csv"
## [34] "regionprops/20210305_NE_mockData1_002.csv"
## [35] "regionprops/20210305_NE_mockData1_003.csv"
## [36] "regionprops/20210305_NE_mockData2_001.csv"
## [37] "regionprops/20210305_NE_mockData2_002.csv"
## [38] "regionprops/20210305_NE_mockData2_003.csv"
## [39] "regionprops/20210305_NE_mockData3_001.csv"
## [40] "regionprops/20210305_NE_mockData3_002.csv"
## [41] "regionprops/20210305_NE_mockData3_003.csv"
## [42] "regionprops/20210305_NE_mockData4_001.csv"
## [43] "regionprops/20210305_NE_mockData4_002.csv"
## [44] "regionprops/20210305_NE_mockData4_003.csv"
## [45] "regionprops/20210305_NE_mockData5_001.csv"
## [46] "regionprops/20210305_NE_mockData5_002.csv"
## [47] "regionprops/20210305_NE_mockData5_003.csv"
## [48] "steinbock.sh"
The example data related to the ImcSegmentationPipeline
and
steinbock
can also be accessed
online.
The imcRtools
package supports reading in data generated by the
ImcSegmentationPipeline
or steinbock pipeline.
To read in the outpout data into a
SpatialExperiment or
SingleCellExperiment, the imcRtools
package
exports the read_cpout
function.
By default, the single-cell data is read into a
SpatialExperiment object. Here, the extracted
channel- and cell-specific intensities are stored in the counts(spe)
slot. All morphological features are stored in colData(spe)
and the
spatial locations of the cells are stored in spatialCoords(spe)
. The
interaction graph is stored in colPair(spe, "neighbourhood")
.
Alternatively, the data can be read into a
SingleCellExperiment object. The only
difference is the lack of spatialCoords(sce)
. Here, the spatial
coordinates are stored in colData(spe)$Pos_X
and colData(spe)$Pos_Y
.
The
ImcSegmentationPipeline
produces a number of output files. By default, all single-cell features
are measured and exported. To browse and select the features of
interest, the imcRtools
package provides the show_cpout_features
function:
path <- system.file("extdata/mockData/cpout", package = "imcRtools")
show_cpout_features(path)
By default, read_cpout
will read in the mean intensity per channel and
cell from “hot pixel” filtered image stacks specified via
intensities = "Intensity_MeanIntensity_FullStackFiltered"
. Please
refer to ?read_cpout
for the full documentation.
cur_path <- system.file("extdata/mockData/cpout", package = "imcRtools")
# Read as SpatialExperiment
(spe <- read_cpout(cur_path, graph_file = "Object_relationships.csv"))
## class: SpatialExperiment
## dim: 5 239
## metadata(0):
## assays(1): counts
## rownames(5): Ag107 Pr141 Sm147 Eu153 Yb172
## rowData names(7): Tube.Number Metal.Tag ... deepcell cellpose
## colnames(239): 1_1 1_2 ... 15_29 15_30
## colData names(12): sample_id ObjectNumber ... Metadata_acid
## Metadata_description
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : Pos_X Pos_Y
## imgData names(1): sample_id
# Read as SingleCellExperiment
(sce <- read_cpout(cur_path, graph_file = "Object_relationships.csv",
return_as = "sce"))
## class: SingleCellExperiment
## dim: 5 239
## metadata(0):
## assays(1): counts
## rownames(5): Ag107 Pr141 Sm147 Eu153 Yb172
## rowData names(7): Tube.Number Metal.Tag ... deepcell cellpose
## colnames(239): 1_1 1_2 ... 15_29 15_30
## colData names(14): sample_id ObjectNumber ... Metadata_acid
## Metadata_description
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
Single-cell data and all associated metadata (e.g. spatial location,
morphology and interaction graphs) as produced by the
steinbock pipeline can
be read in using the read_steinbock
function:
cur_path <- system.file("extdata/mockData/steinbock", package = "imcRtools")
# Read as SpatialExperiment
(spe <- read_steinbock(cur_path))
## class: SpatialExperiment
## dim: 5 404
## metadata(0):
## assays(1): counts
## rownames(5): Ag107 Cytokeratin 5 Laminin YBX1 H3K27Ac
## rowData names(7): channel name ... cellpose Tube.Number
## colnames(404): 20210305_NE_mockData1_001_1 20210305_NE_mockData1_001_2
## ... 20210305_NE_mockData5_003_39 20210305_NE_mockData5_003_40
## colData names(8): sample_id ObjectNumber ... width_px height_px
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : Pos_X Pos_Y
## imgData names(1): sample_id
# Read as SingleCellExperiment
(sce <- read_steinbock(cur_path, return_as = "sce"))
## class: SingleCellExperiment
## dim: 5 404
## metadata(0):
## assays(1): counts
## rownames(5): Ag107 Cytokeratin 5 Laminin YBX1 H3K27Ac
## rowData names(7): channel name ... cellpose Tube.Number
## colnames(404): 20210305_NE_mockData1_001_1 20210305_NE_mockData1_001_2
## ... 20210305_NE_mockData5_003_39 20210305_NE_mockData5_003_40
## colData names(10): sample_id ObjectNumber ... width_px height_px
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
For more information, please refer to ?read_steinbock
.
For reading in and visualization of multi-channel images and
segmentation masks, please refer to the
cytomapper package. The imcRtools
package
however supports reading in raw .txt files generated by the Hyperion
imaging system into a CytoImageList
object; a data container exported
by cytomapper
.
The user needs to provide a path from which all .txt files will be read in:
path <- system.file("extdata/mockData/raw", package = "imcRtools")
cur_CytoImageList <- readImagefromTXT(path)
cur_CytoImageList
## CytoImageList containing 3 image(s)
## names(3): 20210305_NE_mockData2_ROI_001_1 20210305_NE_mockData2_ROI_002_2 20210305_NE_mockData2_ROI_003_3
## Each image contains 5 channel(s)
## channelNames(5): Ag107Di Pr141Di Sm147Di Eu153Di Yb172Di
By specifying the pattern
argument, individual or a subset of files
can be read in. For more information, please refer to
?readImagefromTXT
.
When acquiring IMC images, pixel intensities can be influenced by spillover from neighboring channels. To correct for this, Chevrier et al. have developed a staining protocol to acquire individually spotted metal isotopes (Chevrier et al. 2017). Based on these measurements, spillover into neighboring channels can be quantified to correct pixel intensities.
The imcRtools
package provides helper functions that facilitate the
correction of spillover for IMC data. For a full tutorial, please refer
to the IMC data analysis
book.
In the first step, the pixel intensities of individually spotted metals
need to be read into a SingleCellExperiment
container for downstream
use with the CATALYST package. For this, the
readSCEfromTXT
function can be used:
path <- system.file("extdata/spillover", package = "imcRtools")
sce <- readSCEfromTXT(path)
## Spotted channels: Dy161, Dy162, Dy163, Dy164
## Acquired channels: Dy161, Dy162, Dy163, Dy164
## Channels spotted but not acquired:
## Channels acquired but not spotted:
sce
## class: SingleCellExperiment
## dim: 4 400
## metadata(0):
## assays(1): counts
## rownames(4): 161Dy(Dy161Di) 162Dy(Dy162Di) 163Dy(Dy163Di)
## 164Dy(Dy164Di)
## rowData names(2): channel_name marker_name
## colnames(400): Dy161.1 Dy161.2 ... Dy164.99 Dy164.100
## colData names(9): Start_push End_push ... sample_metal sample_mass
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
Here, the example metal spot files are read in. The spot information are
stored in the colData(sce)
slot and channel information are stored in
rowData(sce)
. Each column represents a single pixel.
In the next step, it is crucial to identify potentially mislabeled spots
or spots with low pixel intensities. The imcRtools
package exports the
plotSpotHeatmap
function, which visualizes the aggregated (default
median
) pixel intensities per spot and per metal:
plotSpotHeatmap(sce)
Here, high median pixel intensities can be observed in each spot and
their corresponding channels (visualized on the log10
scale by
default). To quickly identify spot/channel combinations with low signal,
the threshold
parameter can be set:
plotSpotHeatmap(sce, log = FALSE, threshold = 200)
If pixel intensities are low, spillover estimation might not be robust.
Therefore, the binAcrossPixels
function can be used to sum consecutive
pixels and enhance the acquired signal. This step is optional for
spillover estimation.
sce2 <- binAcrossPixels(sce, bin_size = 5)
plotSpotHeatmap(sce2, log = FALSE, threshold = 200)
Prior to spillover estimation, the CATALYST
package provides the assignPrelim
, estCutoffs
and applyCutoffs
functions to estimate the spotted mass for each pixel based on their
channel intensities. For more information on the spillover estimation
and correction, please refer to the CATALYST
vignette.
This estimation can be used to identify pixels that cannot be easily
assigned to their spotted mass, potentially indicating pixels with weak
signal. To remove these pixels, the filterPixels
function can be used.
This function further removes pixels assigned to masses, which only
contain very few pixels.
library(CATALYST)
bc_key <- as.numeric(unique(sce$sample_mass))
assay(sce, "exprs") <- asinh(counts(sce)/5)
sce <- assignPrelim(sce, bc_key = bc_key)
sce <- estCutoffs(sce)
sce <- applyCutoffs(sce)
# Filter out mislabeled pixels
sce <- filterPixels(sce)
table(sce$bc_id, sce$sample_mass)
##
## 161 162 163 164
## 0 0 0 0 2
## 161 100 0 0 0
## 162 0 100 0 0
## 163 0 0 100 0
## 164 0 0 0 98
Finally, the pre-processed SiingleCellExperiment
object can be used to
generate the spillover matrix using the CATALYST::computeSpillmat
function:
sce <- computeSpillmat(sce)
metadata(sce)$spillover_matrix
## Dy161Di Dy162Di Dy163Di Dy164Di
## Dy161Di 1.000000000 0.031443129 0.009734712 0.006518048
## Dy162Di 0.015715159 1.000000000 0.048116187 0.008250039
## Dy163Di 0.003809504 0.012159704 1.000000000 0.020214651
## Dy164Di 0.005058069 0.008457546 0.028912343 1.000000000
This spillover matrix can be directly applied to compensate the summarized pixel intensities per cell and per channel as described here.
The following section will highlight functions for spatial analyses of the data.
When following the ImcSegmentationPipeline
or steinbock
and reading
in the data using the corresponding functions, the generated graphs are
automatically stored in the colPair(spe, "neighborhood")
slot.
Alternatively, the buildSpatialGraph
function in the imcRtools
package constructs interaction graphs using either (i) cell-centroid
expansion, (ii) k-nearest neighbor search or (iii) delaunay
triangulation.
library(cytomapper)
data("pancreasSCE")
pancreasSCE <- buildSpatialGraph(pancreasSCE, img_id = "ImageNb",
type = "expansion",
threshold = 20)
pancreasSCE <- buildSpatialGraph(pancreasSCE, img_id = "ImageNb",
type = "knn",
k = 5)
pancreasSCE <- buildSpatialGraph(pancreasSCE, img_id = "ImageNb",
type = "delaunay")
colPairNames(pancreasSCE)
## [1] "expansion_interaction_graph" "knn_interaction_graph"
## [3] "delaunay_interaction_graph"
When setting type = "knn"
, by default a directional graph will be
build. Setting directed = FALSE
will create bi-directional edges for
each pair of cells that are connected by at least one edge in the
directed setting.
The cells’ locations and constructed graphs can be visualized using the
plotSpatial
function. Here, cells are referred to as “nodes” and
cell-cell interactions are referred to as “edges”. All visual attributes
of the nodes and edges can be set. Either by specifying a variable in
colData(spe)
, a marker name or a single entry using the *_fix
parameters. By default the plotSpatial
function will visualize equal physical
units on the x- and y-axis with an aspect ratio of 1. The example data are
located in different regions of an image and we therefore set scales = "free"
for simpler visualization.
library(ggplot2)
library(ggraph)
plotSpatial(pancreasSCE,
img_id = "ImageNb",
node_color_by = "CellType",
node_shape_by = "ImageNb",
node_size_by = "Area",
draw_edges = TRUE,
colPairName = "knn_interaction_graph",
directed = FALSE,
scales = "free")
# Colored by expression and with arrows
plotSpatial(pancreasSCE,
img_id = "ImageNb",
node_color_by = "PIN",
assay_type = "exprs",
node_size_fix = 3,
edge_width_fix = 0.2,
draw_edges = TRUE,
colPairName = "knn_interaction_graph",
directed = TRUE,
arrow = grid::arrow(length = grid::unit(0.1, "inch")),
end_cap = ggraph::circle(0.05, "cm"),
scales = "free")
# Subsetting the SingleCellExperiment
plotSpatial(pancreasSCE[,pancreasSCE$Pattern],
img_id = "ImageNb",
node_color_by = "CellType",
node_size_fix = 1,
draw_edges = TRUE,
colPairName = "knn_interaction_graph",
directed = TRUE)
The returned object can be further modified using the ggplot2
logic.
This includes changing the node color, shape and size using
scale_color_*
, scale_shape_*
and scale_size_*
. Edge attributes can
be altered using the scale_edge_*
function exported by ggraph
,
The aggregateNeighbors
function can be used to aggregate features of
all neighboring cells for each individual cell. This function operates
in two settings.
metadata
: when aggregating by cell-specific metadata, the function
computes the relative frequencies of all entries to
colData(sce)[[count_by]]
within the direct neighborhood of each
cell.expression
: the expression counts of neighboring cells are
aggregated using the specified statistic
(defaults to mean
).Each cell’s neighborhood is defined as endpoints of edges stored in
colPair(sce, colPairName)
.
pancreasSCE <- aggregateNeighbors(pancreasSCE,
colPairName = "knn_interaction_graph",
aggregate_by = "metadata",
count_by = "CellType")
head(pancreasSCE$aggregatedNeighbors)
## DataFrame with 6 rows and 3 columns
## celltype_A celltype_B celltype_C
## <numeric> <numeric> <numeric>
## 1 0 0.0 1.0
## 2 0 0.2 0.8
## 3 0 0.0 1.0
## 4 0 0.0 1.0
## 5 0 0.0 1.0
## 6 0 0.0 1.0
pancreasSCE <- aggregateNeighbors(pancreasSCE,
colPairName = "knn_interaction_graph",
aggregate_by = "expression",
assay_type = "exprs")
head(pancreasSCE$mean_aggregatedExpression)
## DataFrame with 6 rows and 5 columns
## H3 CD99 PIN CD8a CDH
## <numeric> <numeric> <numeric> <numeric> <numeric>
## 1 2.32500 0.860329 0.092871 0.725000 2.51264
## 2 2.88022 1.629762 0.319527 0.207873 2.46486
## 3 3.10829 0.735389 0.190616 0.255515 1.89484
## 4 2.55842 0.773342 0.124545 0.188629 2.51084
## 5 2.44287 1.126240 0.252129 0.200261 2.61336
## 6 2.65059 0.903869 0.181792 0.196691 2.16434
The returned entries can now be used for clustering to group cells based on their environment (either by aggregated categorical features or expression).
set.seed(22)
cur_cluster <- kmeans(pancreasSCE$aggregatedNeighbors, centers = 3)
pancreasSCE$clustered_neighbors <- factor(cur_cluster$cluster)
# Visualize CellType and clustered_neighbors
plotSpatial(pancreasSCE,
img_id = "ImageNb",
node_color_by = "CellType",
node_size_fix = 4,
edge_width_fix = 2,
edge_color_by = "clustered_neighbors",
draw_edges = TRUE,
colPairName = "knn_interaction_graph",
directed = FALSE,
nodes_first = FALSE,
scales = "free") +
scale_color_brewer(palette = "Set2") +
scale_edge_color_brewer(palette = "Set1")
# Visualize clustered_neighbors
plotSpatial(pancreasSCE,
img_id = "ImageNb",
node_color_by = "clustered_neighbors",
node_size_fix = 4,
edge_width_fix = 1,
draw_edges = TRUE,
colPairName = "knn_interaction_graph",
directed = FALSE,
nodes_first = FALSE,
scales = "free")+
scale_color_brewer(palette = "Set1")
The single cell assignments derived from clustering cells based on their environment can be interpreted as cellular neighborhoods (CNs), which can represent sites of unique local processes (Schürch et al. 2020).
Downstream of CNs, imcRtools exports three functions to detect and analyze the spatial context (SC) of each cell.
detectSpatialContext
: for the function to detect SCsfilterSpatialContext
: for the function to filter SCsplotSpatialContext
: for the function to plot SC graphsThe term SC was coined by Bhate and colleagues (Bhate et al. 2022) and describes tissue regions in which distinct CNs may be interacting, which can lead to specialized local biological events.
The detectSpatialContext
function relies on CN fractions for each cell in
a spatial interaction graph (originally a k-nearest neighbor (KNN) graph).
We can retrieve the CN fractions using the above-described buildSpatialGraph
and aggregateNeighbors
functions. The window size (k for KNN) for
buildSpatialGraph
should reflect a length scale on which biological signals
can be exchanged and depends, among others, on cell density and tissue area.
In view of their divergent functionality, we recommend to use a larger
window size for SC (interaction between local processes) than for CN
(local processes) detection.
Subsequently, the CN fractions are sorted from high-to-low and the SC of each cell is assigned the minimal combination of SCs that additively surpass a user-defined threshold. The default threshold of 0.9 aims to represent the dominant CNs, hence the most prevalent signals, in a given window.
For more details, please refer to (Bhate et al. 2022).
# Generate k-nearest neighbor graph
pancreasSCE <- buildSpatialGraph(pancreasSCE, img_id = "ImageNb",
type = "knn",
name = "knn_spatialcontext_graph",
k = 15)
colPairNames(pancreasSCE)
## [1] "expansion_interaction_graph" "knn_interaction_graph"
## [3] "delaunay_interaction_graph" "knn_spatialcontext_graph"
# Aggregate based on clustered_neighbors
pancreasSCE <- aggregateNeighbors(pancreasSCE,
colPairName = "knn_spatialcontext_graph",
aggregate_by = "metadata",
count_by = "clustered_neighbors",
name = "aggregatedNeighborhood")
# Detect spatial contexts
pancreasSCE <- detectSpatialContext(pancreasSCE,
entry = "aggregatedNeighborhood",
threshold = 0.9,
name = "spatial_context")
# Define SC color scheme
col_SC <- setNames(c("#A6CEE3", "#1F78B4", "#B2DF8A", "#33A02C", "#FB9A99", "#E31A1C", "#FDBF6F"),
sort(unique(pancreasSCE$spatial_context)))
# Visualize spatial contexts on images
plotSpatial(pancreasSCE,
img_id = "ImageNb",
node_color_by = "spatial_context",
node_size_fix = 4,
edge_width_fix = 1,
draw_edges = TRUE,
colPairName = "knn_spatialcontext_graph",
directed = FALSE,
nodes_first = FALSE,
scales = "free") +
scale_color_manual(values = col_SC)
After SC assignment for each individual cell, the filterSpatialContext
function allows to filter detected SCs based on user-defined thresholds
for number of group entries (usually image or patient ID) and/or total
number of cells per SC.
In addition to a new column entry to the colData(object)
, the function
also returns a data.frame
entry to metadata(object)
containing filtered group
and cell counts per SC.
# Filter spatial contexts
# By number of group entries
pancreasSCE <- filterSpatialContext(pancreasSCE,
entry = "spatial_context",
group_by = "ImageNb",
group_threshold = 2,
name = "spatial_context_filtered_group")
metadata(pancreasSCE)$filterSpatialContext
## spatial_context n_cells n_group
## 1 1 87 2
## 2 1_2 71 2
## 4 1_3 90 2
## 5 2 55 2
## 7 3 14 2
plotSpatial(pancreasSCE,
img_id = "ImageNb",
node_color_by = "spatial_context_filtered_group",
node_size_fix = 4,
edge_width_fix = 1,
draw_edges = TRUE,
colPairName = "knn_spatialcontext_graph",
directed = FALSE,
nodes_first = FALSE,
scales = "free") +
scale_color_manual(values = col_SC)
# By total number of cells
pancreasSCE <- filterSpatialContext(pancreasSCE,
entry = "spatial_context",
group_by = "ImageNb",
cells_threshold = 15,
name = "spatial_context_filtered_cells")
metadata(pancreasSCE)$filterSpatialContext
## spatial_context n_cells n_group
## 1 1 87 2
## 2 1_2 71 2
## 3 1_2_3 16 1
## 4 1_3 90 2
## 5 2 55 2
## 6 2_3 29 1
plotSpatial(pancreasSCE,
img_id = "ImageNb",
node_color_by = "spatial_context_filtered_cells",
node_size_fix = 4,
edge_width_fix = 1,
draw_edges = TRUE,
colPairName = "knn_spatialcontext_graph",
directed = FALSE,
nodes_first = FALSE,
scales = "free") +
scale_color_manual(values = col_SC)
Lastly, the plotSpatialContext
function plots directed SC graphs,
akin to CN combination maps in (Bhate et al. 2022), based on symbolic
edge-lists and vertex metadata, which operates on cohort-level.
## Plot spatial context graph
# Default
plotSpatialContext(pancreasSCE,
entry = "spatial_context",
group_by = "ImageNb")
# Colored by name and size by n_cells
plotSpatialContext(pancreasSCE,
entry = "spatial_context",
group_by = "ImageNb",
node_color_by = "name",
node_size_by = "n_cells",
node_label_color_by = "name")
# Colored by n_cells and size by n_group
plotSpatialContext(pancreasSCE,
entry = "spatial_context",
group_by = "ImageNb",
node_color_by = "n_cells",
node_size_by = "n_group",
node_label_color_by = "n_cells")+
scale_color_viridis()