1 Available datasets

The TENxVisiumData package provides an R/Bioconductor resource for Visium spatial gene expression datasets by 10X Genomics. The package currently includes 13 datasets from 23 samples across two organisms (human and mouse) and 13 tissues.

A list of currently available datasets can be obtained using the ExperimentHub interface:

library(ExperimentHub)
eh <- ExperimentHub()
(q <- query(eh, "TENxVisium"))
## ExperimentHub with 26 records
## # snapshotDate(): 2023-04-24
## # $dataprovider: 10X Genomics
## # $species: Homo sapiens, Mus musculus
## # $rdataclass: SpatialExperiment
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH6695"]]' 
## 
##            title                            
##   EH6695 | HumanBreastCancerIDC             
##   EH6696 | HumanBreastCancerILC             
##   EH6697 | HumanCerebellum                  
##   EH6698 | HumanColorectalCancer            
##   EH6699 | HumanGlioblastoma                
##   ...      ...                              
##   EH6739 | HumanSpinalCord_v3.13            
##   EH6740 | MouseBrainCoronal_v3.13          
##   EH6741 | MouseBrainSagittalPosterior_v3.13
##   EH6742 | MouseBrainSagittalAnterior_v3.13 
##   EH6743 | MouseKidneyCoronal_v3.13
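Because the printed query output is truncated, it can help to tabulate all records at once. The following is a minimal sketch, assuming the hub's `$` accessor exposes the `title` and `species` metadata columns listed in the query output; the variable name `records` is introduced here purely for illustration.

```r
library(ExperimentHub)

eh <- ExperimentHub()            # initialize hub instance
q <- query(eh, "TENxVisium")     # records matching 'TENxVisium'

# collect accession IDs, titles and species into a plain data.frame
# ('records' is a hypothetical name used only in this sketch)
records <- data.frame(
    id = names(q),
    title = q$title,
    species = q$species)

# print every record rather than the truncated default view
print(records, row.names = FALSE)

# number of records per organism
table(records$species)
```

The resulting table has one row per hub record, so it can be filtered with ordinary data.frame operations.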

2 Loading the data

To retrieve a dataset, we can use a dataset’s corresponding named function <id>(), where <id> should correspond to a valid dataset identifier (see ?TENxVisiumData). E.g.:

library(TENxVisiumData)
spe <- HumanHeart()

Alternatively, data can be loaded directly from Bioconductor’s ExperimentHub as follows. First, we initialize a hub instance and store the complete list of records in a variable eh. Using query(), we then identify any records made available by the TENxVisiumData package, as well as their accession IDs (of the form EH1234). Finally, we can load the data into R via eh[[id]], where id corresponds to the identifier of the data entry we’d like to load. E.g.:

library(ExperimentHub)
eh <- ExperimentHub()        # initialize hub instance
q <- query(eh, "TENxVisium") # retrieve 'TENxVisiumData' records
id <- q$ah_id[1]             # specify dataset ID to load
spe <- eh[[id]]              # load specified dataset
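Selecting a record by position depends on the order in which the hub returns its records. A possibly more robust variant, sketched below, looks up the accession ID by title instead. It assumes that titles are unique within the query result and uses the title HumanBreastCancerIDC as shown in the record listing; the variable `wanted` is a name introduced for this example only.

```r
library(ExperimentHub)

eh <- ExperimentHub()                 # initialize hub instance
q <- query(eh, "TENxVisium")          # retrieve 'TENxVisiumData' records

wanted <- "HumanBreastCancerIDC"      # title as listed in the query output
id <- names(q)[q$title == wanted]     # accession ID(s) with a matching title

# guard against a missing or ambiguous match before downloading
stopifnot(length(id) == 1)

spe <- eh[[id]]                       # load specified dataset
```

Exact matching on the title avoids picking up the versioned records (those with a suffix such as _v3.13) by accident.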

3 Data representation

Each dataset is provided as a SpatialExperiment (SPE), which extends the SingleCellExperiment (SCE) class with features specific to spatially resolved data:

spe
## class: SpatialExperiment 
## dim: 36601 7785 
## metadata(0):
## assays(1): counts
## rownames(36601): ENSG00000243485 ENSG00000237613 ... ENSG00000278817
##   ENSG00000277196
## rowData names(1): symbol
## colnames(7785): AAACAAGTATCTCCCA-1 AAACACCAATAACTGC-1 ...
##   TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1
## colData names(1): sample_id
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor

For details on the SPE class, we refer to the SpatialExperiment package’s vignette. Briefly, the SPE harbors the following data in addition to that stored in an SCE:

spatialCoords; a numeric matrix of spatial coordinates, stored inside the object’s int_colData:

head(spatialCoords(spe))
##                    pxl_col_in_fullres pxl_row_in_fullres
## AAACAAGTATCTCCCA-1              15937              17428
## AAACACCAATAACTGC-1              18054               6092
## AAACAGAGCGACTCCT-1               7383              16351
## AAACAGGGTCTATATT-1              15202               5278
## AAACAGTGTTCCTGGG-1              21386               9363
## AAACATTTCCCGGATT-1              18549              16740

spatialData; a DFrame of spatially-related sample metadata, stored as part of the object’s colData. This colData subset is in turn determined by the int_metadata field spatialDataNames:

head(spatialData(spe))
## DataFrame with 6 rows and 0 columns

imgData; a DFrame containing image-related data, stored inside the int_metadata:

imgData(spe)
## DataFrame with 2 rows and 4 columns
##               sample_id    image_id   data scaleFactor
##             <character> <character> <list>   <numeric>
## 1 HumanBreastCancerIDC1      lowres   ####   0.0247525
## 2 HumanBreastCancerIDC2      lowres   ####   0.0247525
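The image entries listed in imgData can be retrieved as raster objects. The sketch below assumes the SpatialExperiment accessors imgRaster() and scaleFactors() and takes the sample and image identifiers from the first row of imgData rather than hard-coding them; the variables `sid`, `iid`, `img` and `sf` are illustrative names.

```r
library(TENxVisiumData)
library(SpatialExperiment)

spe <- HumanHeart()

# take sample and image identifiers from the first imgData entry
sid <- imgData(spe)$sample_id[1]
iid <- imgData(spe)$image_id[1]

# retrieve the image as a raster object and inspect its dimensions
img <- imgRaster(spe, sample_id = sid, image_id = iid)
dim(img)

# draw the tissue image with base graphics
plot(img)

# scale factor relating full-resolution pixel coordinates to this image
sf <- scaleFactors(spe, sample_id = sid, image_id = iid)
sf
```

Multiplying the spatialCoords by this scale factor should place spots in the coordinate system of the retrieved image.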

Datasets with multiple sections are consolidated into a single SPE with colData field sample_id indicating each spot’s sample of origin. E.g.:

spe <- MouseBrainSagittalAnterior()
table(spe$sample_id)
## 
## MouseBrainSagittalAnterior1 MouseBrainSagittalAnterior2 
##                        2695                        2825
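Since sample_id is an ordinary colData column, individual sections can be recovered by column subsetting. The following is a minimal sketch; the list name `by_sample` is introduced here for illustration only.

```r
library(TENxVisiumData)
library(SpatialExperiment)

spe <- MouseBrainSagittalAnterior()

# split the combined object into one SPE per section
ids <- unique(spe$sample_id)
by_sample <- lapply(
    setNames(ids, ids),
    function(s) spe[, spe$sample_id == s])

# number of spots per section; this should agree with table(spe$sample_id)
vapply(by_sample, ncol, integer(1))
```

Each list element is itself a SpatialExperiment, so downstream steps can be applied to every section with lapply().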

Datasets of targeted analyses are provided as a nested SPE, with whole transcriptome measurements as primary data, and those obtained from targeted panels as altExps. E.g.:

spe <- HumanOvarianCancer()
altExpNames(spe)
## [1] "TargetedImmunology" "TargetedPanCancer"
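The targeted measurements can be extracted with the altExp() accessor from SingleCellExperiment. The sketch below pulls out one of the panels named above and compares its dimensions with the whole transcriptome data; the variable `tgt` is a name chosen for this example.

```r
library(TENxVisiumData)
library(SingleCellExperiment)

spe <- HumanOvarianCancer()

# extract the targeted immunology panel as its own object
tgt <- altExp(spe, "TargetedImmunology")

# the panel covers the same spots but a different set of features
dim(spe)
dim(tgt)

# spots (columns) line up between primary and alternative experiments
identical(colnames(spe), colnames(tgt))
```

Keeping the panels as altExps means that subsetting the spots of the main object subsets the targeted data in the same way.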

Session information

sessionInfo()
## R version 4.3.0 RC (2023-04-13 r84269)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.17-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] TENxVisiumData_1.8.0        SpatialExperiment_1.10.0   
##  [3] SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.0
##  [5] Biobase_2.60.0              GenomicRanges_1.52.0       
##  [7] GenomeInfoDb_1.36.0         IRanges_2.34.0             
##  [9] S4Vectors_0.38.0            MatrixGenerics_1.12.0      
## [11] matrixStats_0.63.0          ExperimentHub_2.8.0        
## [13] AnnotationHub_3.8.0         BiocFileCache_2.8.0        
## [15] dbplyr_2.3.2                BiocGenerics_0.46.0        
## [17] BiocStyle_2.28.0           
## 
## loaded via a namespace (and not attached):
##  [1] DBI_1.1.3                     bitops_1.0-7                 
##  [3] rlang_1.1.0                   magrittr_2.0.3               
##  [5] compiler_4.3.0                RSQLite_2.3.1                
##  [7] DelayedMatrixStats_1.22.0     png_0.1-8                    
##  [9] vctrs_0.6.2                   pkgconfig_2.0.3              
## [11] crayon_1.5.2                  fastmap_1.1.1                
## [13] magick_2.7.4                  XVector_0.40.0               
## [15] ellipsis_0.3.2                scuttle_1.10.0               
## [17] utf8_1.2.3                    promises_1.2.0.1             
## [19] rmarkdown_2.21                purrr_1.0.1                  
## [21] bit_4.0.5                     xfun_0.39                    
## [23] zlibbioc_1.46.0               cachem_1.0.7                 
## [25] beachmat_2.16.0               jsonlite_1.8.4               
## [27] blob_1.2.4                    later_1.3.0                  
## [29] rhdf5filters_1.12.0           DelayedArray_0.26.0          
## [31] Rhdf5lib_1.22.0               BiocParallel_1.34.0          
## [33] interactiveDisplayBase_1.38.0 parallel_4.3.0               
## [35] R6_2.5.1                      bslib_0.4.2                  
## [37] limma_3.56.0                  jquerylib_0.1.4              
## [39] Rcpp_1.0.10                   bookdown_0.33                
## [41] knitr_1.42                    R.utils_2.12.2               
## [43] httpuv_1.6.9                  Matrix_1.5-4                 
## [45] tidyselect_1.2.0              yaml_2.3.7                   
## [47] codetools_0.2-19              curl_5.0.0                   
## [49] lattice_0.21-8                tibble_3.2.1                 
## [51] shiny_1.7.4                   withr_2.5.0                  
## [53] KEGGREST_1.40.0               evaluate_0.20                
## [55] Biostrings_2.68.0             pillar_1.9.0                 
## [57] BiocManager_1.30.20           filelock_1.0.2               
## [59] generics_0.1.3                RCurl_1.98-1.12              
## [61] BiocVersion_3.17.1            sparseMatrixStats_1.12.0     
## [63] xtable_1.8-4                  glue_1.6.2                   
## [65] tools_4.3.0                   locfit_1.5-9.7               
## [67] rhdf5_2.44.0                  grid_4.3.0                   
## [69] DropletUtils_1.20.0           AnnotationDbi_1.62.0         
## [71] edgeR_3.42.0                  GenomeInfoDbData_1.2.10      
## [73] HDF5Array_1.28.0              cli_3.6.1                    
## [75] rappdirs_0.3.3                fansi_1.0.4                  
## [77] dplyr_1.1.2                   R.methodsS3_1.8.2            
## [79] sass_0.4.5                    digest_0.6.31                
## [81] dqrng_0.3.0                   rjson_0.2.21                 
## [83] memoise_2.0.1                 htmltools_0.5.5              
## [85] R.oo_1.25.0                   lifecycle_1.0.3              
## [87] httr_1.4.5                    mime_0.12                    
## [89] bit64_4.0.5