Introduction

In this vignette, we will cover some of the possible sources for publicly available single-cell data, how to format it for and process it with MCIA and how to run a basic analysis in order to annotate the data and use the results as metadata for exploring the decomposition results.

Installation

To install the development version from GitHub:

# devel version

# install.packages("devtools")
devtools::install_github("Muunraker/nipalsMCIA", ref = "devel",
                         force = TRUE, build_vignettes = TRUE) # devel version

To install the released Bioconductor version:

# release version
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("nipalsMCIA")
# note that the TENxPBMCData package is not included in this list as you may
# decide to pull data from another source or use our provided objects

library(BiocFileCache)
library(dplyr)
library(ggplot2)
library(ggpubr)
library(nipalsMCIA)
library(piggyback)
library(Seurat)

# NIPALS starts with a random vector
set.seed(42)
# if you would like to save any of the data loaded/created locally
path_data <- file.path("..", "data")

To manage the data used in this vignette, we will be using BiocFileCache:

# as suggested in: https://bioconductor.org/packages/release/bioc/vignettes/BiocFileCache/inst/doc/BiocFileCache.html#cache-to-manage-package-data
# since we only use the cache in this vignette, we don't need wrapper functions

# initialize the cache in a temporary directory
bfc <- BiocFileCache::BiocFileCache()
  
# set up or load the unique resource ids
rid <- BiocFileCache::bfcquery(bfc, query = "vignette_2", "rname")$rid
# rid <- bfcrid(bfc) # we only have one rname, so you could also just use this

# if the data is not in the cache, download and add it
if (length(rid) == 0) {
  # specify `tag = ` to use a different release other than latest
  urls <-
    piggyback::pb_download_url(repo = "Muunraker/nipalsMCIA", tag = "latest")
  
  # you could add all of the urls to rid at once, but it would throw this error
  # when it comes to downloading them:
  # "Error in parse_url(url) : length(url) == 1 is not TRUE"
  for (url in urls) {
    # the rids are numbered in the order they are added e.g. BFC1, BFC2, ...
    message(paste("Downloading file", url))
    rid <- names(BiocFileCache::bfcadd(bfc, rname = "vignette_2", url))
    
    # check if the cached data needs updating and download it if needed
    if (!isFALSE(BiocFileCache::bfcneedsupdate(bfc, rid))) {
      # this will overwrite existing files without asking
      BiocFileCache::bfcdownload(bfc, rid, ask = FALSE)
    }
  }
}

# return the paths to the cached files
# BiocFileCache::bfcrpath(bfc, rid = rid)

The temporary directory being used is:

# you can also run tools::R_user_dir("BiocFileCache", which = "cache")
BiocFileCache::getBFCOption("CACHE")
## [1] "/home/biocbuild/.cache/R/BiocFileCache"

Vignette Pipeline

The lines for the data_sc_sce.Rda are dotted because we do not provide this object below; however, the code describing how to generate it is still included.