1 Installation

Install the package using Bioconductor. Start R and enter:

if(!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
BiocManager::install("MouseAgingData")

2 Setup

Now, load the package and dependencies used in the vignette.

library(scater)
library(MouseAgingData)

3 Introduction

Single-cell sequencing technology can reveal intricate details about individual cells, allowing researchers to interrogate the genetic makeup of cells within a heterogeneous sample. Single-cell sequencing can provide insights into various aspects of cellular biology, such as characterization of cell populations, identification of rare cell types, and quantification of expression levels in cell types across experimental treatments. Given this wide utility, single-cell sequencing has expanded scientific knowledge in various fields, including cancer research, immunology, developmental biology, neurobiology, and microbiology.

There are several methods for generating single-cell sequencing data which can extract information (DNA or RNA) from a cell. These include, but are not limited to:

  1. Droplet-based platforms: such as 10x Genomics Chromium system, inDrop, Drop-seq, and Seq-Well, which use microfluidic devices to isolate individual cells into tiny droplets along with unique barcoded beads.

  2. Plate- or microwell-based methods: such as the Smart-seq2 protocol or the C1 system by Fluidigm, respectively. These platforms employ microfluidic chips or multi-well arrays to capture and process individual cells. Unlike in droplet-based platforms, cells are sorted, manually or automatically, into individual wells of a plate.

The MouseAgingData package provides analysis-ready data from an aging mouse brain parabiosis single-cell study by Ximerakis & Holton et al. (2023), along with additional datasets. The contents of the package can be accessed by querying ExperimentHub with the package name.
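As a minimal sketch of such a query, the following uses the standard ExperimentHub interface (`ExperimentHub()` and `query()`). It assumes the ExperimentHub package is installed and that a network connection or a local cache is available; the object names `eh` and `records` are illustrative only.

```r
library(ExperimentHub)

# Connect to the hub; the metadata index is downloaded or refreshed on first use
eh <- ExperimentHub()

# Retrieve every record registered under the package name
records <- query(eh, "MouseAgingData")
records

# Number of matching records and their titles
length(records)
records$title
```

An individual record can then be retrieved with double-bracket indexing on its hub identifier, which is what accessor functions such as parabiosis10x() do on the user's behalf.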

4 Data

Ximerakis & Holton et al. investigated how heterochronic parabiosis (joining of the circulatory systems) affects the mouse brain in terms of aging and rejuvenation. They identified gene signatures attributed to aging in specific cell types, focusing especially on brain endothelial cells, which showed dynamic transcriptional changes that affect vascular structure and function.

The parabiosis single-cell RNA-seq dataset (Ximerakis, Holton et al., Nature Aging 2023) includes 105,329 cells and 31 cell types across 8 OX, 8 YX, 7 YY, 9 YO, 7 OO, and 11 OY animals, with 20,905 features.

This vignette demonstrates how to access and visualize the droplet data using reduced dimensionality coordinates provided by the authors.

5 Load the data set from ExperimentHub

sce <- parabiosis10x()
#> see ?MouseAgingData and browseVignettes('MouseAgingData') for documentation
#> loading from cache

View the SingleCellExperiment data.

sce
#> class: SingleCellExperiment 
#> dim: 20905 105329 
#> metadata(1): cell_colors
#> assays(1): counts
#> rownames(20905): Xkr4 Gm37381 ... DHRSX CAAA01147332.1
#> rowData names(2): geneID HVG
#> colnames: NULL
#> colData names(12): barcode nCount_RNA ... cell_ontology_class
#>   cell_ontology_id
#> reducedDimNames(3): PCA UMAP TSNE
#> mainExpName: NULL
#> altExpNames(0):

Check that the data loaded correctly and match what we expected. The object summary above confirms that there are indeed 105,329 cells and 20,905 features. Here, we view the cell-level information stored in the colData of the object.

head(colData(sce)) 
#> DataFrame with 6 rows and 12 columns
#>            barcode nCount_RNA nFeature_RNA   animal    batch animal_type
#>        <character>  <numeric>    <integer> <factor> <factor>    <factor>
#> 1 AAACCTGGTCAGTGGA    2100.06          815     OO1L   Batch1          OO
#> 2 AAACCTGGTGTCAATC    4356.88         3120     OO1L   Batch1          OO
#> 3 AAACCTGTCAAACCAC    2679.97         1208     OO1L   Batch1          OO
#> 4 AAACCTGTCGTTACAG    3647.74         2137     OO1L   Batch1          OO
#> 5 AAACGGGCACGAGAGT    1904.85          703     OO1L   Batch1          OO
#> 6 AAAGATGAGCGTAGTG    3732.96         2247     OO1L   Batch1          OO
#>   percent_mito percent_ribo cell_type subpopulation
#>      <numeric>    <numeric>  <factor>      <factor>
#> 1     1.253203      5.81833     OPC         qOPC   
#> 2     0.510883      3.48925     NendC       NendC_3
#> 3     0.789625      3.67955     OPC         qOPC   
#> 4     0.607773      3.99532     GABA        GABA_3 
#> 5     1.746996      8.52778     EC          EC_1   
#> 6     0.652196      3.85105     GABA        GABA_13
#>              cell_ontology_class cell_ontology_id
#>                         <factor>         <factor>
#> 1 oligodendrocyte precursor cell       CL_0002453
#> 2 neuroendocrine cell                  CL_0000165
#> 3 oligodendrocyte precursor cell       CL_0002453
#> 4 GABAergic neuron                     CL_0000617
#> 5 endothelial cell                     CL_0000115
#> 6 GABAergic neuron                     CL_0000617
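The cell-level metadata can also be summarised to cross-check the dataset description. The sketch below uses only base R functions and the colData columns shown above (`animal`, `animal_type`, `cell_type`); it reloads the object so that it runs on its own, and the variable names are illustrative.

```r
library(MouseAgingData)
library(SingleCellExperiment)

sce <- parabiosis10x()

# Features (rows) and cells (columns)
dim(sce)

# Number of cells contributed by each parabiosis group
cells_per_group <- table(sce$animal_type)
cells_per_group

# Number of distinct animals within each group
animals_per_group <- tapply(
    as.character(sce$animal), sce$animal_type,
    function(x) length(unique(x))
)
animals_per_group

# Number of annotated cell types actually present in the data
nlevels(droplevels(sce$cell_type))
```

These counts can be compared with the cell, animal, and cell-type numbers quoted in the Data section.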

6 Visualization

For this dataset, the authors have provided their exact UMAP and tSNE coordinates, as well as the color scheme representing the cell types in their paper. The color scheme is stored in the metadata slot of the SingleCellExperiment object and can be retrieved with the metadata() function. To recreate their figures consistently, let’s plot using the provided reduced dimensionality coordinates.

cell.color <- metadata(sce)$cell_colors

gg <- plotUMAP(sce, color_by = "cell_type", text_by = "cell_type") 
gg + theme(legend.title=element_blank()) + 
    scale_color_manual(values=c(cell.color))
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
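The same approach should carry over to the tSNE coordinates, which are stored under the name "TSNE" in reducedDimNames. The following is a sketch rather than a reproduction of a specific figure from the paper; it reloads the data so that it runs on its own, and `gg_tsne` is an illustrative name.

```r
library(scater)
library(MouseAgingData)

sce <- parabiosis10x()
cell.color <- metadata(sce)$cell_colors

# Inspect the stored coordinates: one row per cell, one column per dimension
head(reducedDim(sce, "TSNE"))

# Plot the tSNE embedding with the authors' cell-type colors
gg_tsne <- plotTSNE(sce, color_by = "cell_type", text_by = "cell_type")
gg_tsne + theme(legend.title = element_blank()) +
    scale_color_manual(values = c(cell.color))
```

As with the UMAP plot, ggplot2 reports that the existing colour scale is being replaced; this message is expected when overriding the default palette.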