1 Introduction

Palatine tonsils are constantly exposed to antigens via the upper respiratory tract, which makes them a compelling model secondary lymphoid organ (SLO) for studying the interplay between innate and adaptive immune cells (Ruddle and Akirav 2009). Tonsils have a non-keratinizing stratified squamous epithelium organized into tubular, branched crypts that enlarge the tonsillar surface. Within the crypts, microfold cells (M cells) sample antigens at their apical membrane. Subsequently, antigen-presenting cells (APC), such as dendritic cells (DC), process and present antigens to T cells in the interfollicular or T cell zone. Alternatively, antigens are kept intact by follicular dendritic cells (FDC) in lymphoid follicles, where they are recognized by B cells (Nave, Gebert, and Pabst 2001). Such recognition triggers the germinal center (GC) reaction, whereby naive B cells (NBC) undergo clonal selection, proliferation, somatic hypermutation, class switch recombination (CSR) and differentiation into long-lived plasma cells (PC) or memory B cells (MBC) (De Silva and Klein 2015).

In the context of the Human Cell Atlas (HCA) (Regev et al. 2018), we have created a taxonomy of 121 cell types and states in a human tonsil. Since the transcriptome is just a snapshot of a cell’s state (Wagner, Regev, and Yosef 2016), we have added other layers to define cell identity: single-cell resolved open chromatin epigenomic landscapes (scATAC-seq and scRNA/ATAC-seq; i.e. Multiome) as well as protein (CITE-seq), adaptive repertoire (single-cell B and T cell receptor sequencing; i.e. scBCR-seq and scTCR-seq) and spatial transcriptomics (ST) profiles.

The HCATonsilData package provides programmatic and modular access to the datasets of the different modalities and cell types of the tonsil atlas. It also documents how the dataset was generated and archived in different repositories, from raw fastq files to processed datasets. Finally, it explains the cell- and sample-level metadata in detail, and allows users to trace back the annotation of each cell type through a detailed glossary.

2 Installation

HCATonsilData is available in Bioconductor and can be installed as follows:

if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("HCATonsilData")


Load the necessary packages in R:


3 Overview of the dataset

We obtained a total of 17 human tonsils, which covered three age groups: children (n=6, 3-5 years), young adults (n=8, 19-35 years), and old adults (n=3, 56-65 years). We collected them in a discovery cohort (n=10), which we used to cluster and annotate cell types; and a validation cohort (n=7), which we used to validate the presence, annotation and markers of the discovered cell types. The following table corresponds to the donor-level metadata:


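# Render the donor-level metadata as a table
# (`donor_metadata` is an assumed name for the metadata object)
kable(
  donor_metadata,
  format = "markdown",
  caption = "Donor Metadata",
  align = "c"
) |> kable_styling(full_width = FALSE)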
Table 1: Donor Metadata

| donor_id | hospital | sex | age | age_group | cause_for_tonsillectomy | cohort_type | comments |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| BCLL-2-T | Clinic | male | 65 | old_adult | tonsil removed during surgery for benign squamous pharyngeal papillomatosis | discovery | NA |
| BCLL-6-T | Clinic | male | 35 | young_adult | sleep apnea | discovery | NA |
| BCLL-8-T | CIMA | male | 4 | child | tonsillitis | discovery | NA |
| BCLL-9-T | CIMA | male | 5 | child | tonsillitis | discovery | NA |
| BCLL-10-T | CIMA | male | 3 | child | tonsillitis | discovery | NA |
| BCLL-11-T | CIMA | female | 5 | child | tonsillitis | discovery | NA |
| BCLL-12-T | CIMA | female | 3 | child | tonsillitis | discovery | NA |
| BCLL-13-T | CIMA | female | 5 | child | tonsillitis | discovery | NA |
| BCLL-14-T | CIMA | male | 26 | young_adult | sleep apnea | discovery | NA |
| BCLL-15-T | CIMA | male | 33 | young_adult | sleep apnea | discovery | NA |
| BCLL-20-T | Newcastle | male | 23 | young_adult | tonsillitis | validation | original id: TIP01 |
| BCLL-21-T | Newcastle | female | 19 | young_adult | tonsillitis | validation | original id: TIP02 |
| BCLL-22-T | Newcastle | female | 22 | young_adult | tonsillitis | validation | original id: TIP03 |
| BCLL-24-T | Clinic | male | 63 | old_adult | tonsil removed during surgery for superficial squamous carcinoma of the laryngeal vocal cord | validation | NA |
| BCLL-25-T | Clinic | female | 25 | young_adult | tonsillitis | validation | NA |
| BCLL-26-T | Clinic | male | 56 | old_adult | sleep apnea | validation | NA |
| BCLL-28-T | Newcastle | male | 28 | young_adult | tonsillitis | validation | original id: TIP04 |
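
The cohort sizes and age ranges quoted in the text can be checked against the donor table. The following is a minimal sketch in base R, not part of the package: the `donors` data frame is a hypothetical object, transcribed by hand from the donor metadata shown here and keeping only the columns needed for the summary.

```r
# Hypothetical data frame transcribed from the donor metadata table
# (only the columns needed for the summary are kept).
donors <- data.frame(
  donor_id = paste0("BCLL-", c(2, 6, 8:15, 20:22, 24:26, 28), "-T"),
  age = c(65, 35, 4, 5, 3, 5, 3, 5, 26, 33, 23, 19, 22, 63, 25, 56, 28),
  age_group = c(
    "old_adult", "young_adult", rep("child", 6),
    rep("young_adult", 5), "old_adult", "young_adult",
    "old_adult", "young_adult"
  ),
  cohort_type = rep(c("discovery", "validation"), times = c(10, 7))
)

# Number of donors per age group (rows) and cohort (columns)
counts <- table(donors$age_group, donors$cohort_type)

# Minimum and maximum age within each age group
age_ranges <- tapply(donors$age, donors$age_group, range)

print(counts)
print(age_ranges)
```

The column sums of `counts` give the discovery and validation cohort sizes, and `age_ranges` gives the age span of each group, so both can be compared directly with the numbers stated in the overview.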

These tonsil samples were processed with different data modalities:

The following heatmap shows which samples were sequenced with which technology, together with their cohort type: