LACE 2.0 is a new release of the LACE R package. It performs clonal evolution analyses on single-cell sequencing data, including longitudinal experiments. LACE 2.0 lets users annotate variants and retrieve the relevant mutations interactively based on user-defined filtering criteria; it infers the maximum likelihood clonal tree, the cell attachment matrix and the false positive/negative rates using Boolean matrix factorization. Furthermore, LACE 2.0 supports the investigation of cancer clonal evolution under different experimental conditions and of the occurrence of single mutations, which can be queried via the Ensembl database.

0.1 Installation of LACE 2.0 R package

The package is available on GitHub and Bioconductor. LACE 2.0 requires R >= 4.2.0 and Bioconductor. To install LACE 2.0 from Bioconductor, run:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("LACE")
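
The prerequisites stated above can be checked from the R console. The following is a minimal sketch using only base R; the variable names are illustrative and not part of LACE.

```r
# Check the minimum R version stated above (R >= 4.2.0).
r_ok <- getRversion() >= "4.2.0"

# Check whether the LACE package can be found in the current library paths.
lace_installed <- requireNamespace("LACE", quietly = TRUE)

if (!r_ok) {
  message("R ", getRversion(), " is older than 4.2.0; please upgrade R first.")
}
if (!lace_installed) {
  message("LACE is not installed yet; run BiocManager::install(\"LACE\").")
}
```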

LACE 2.0 uses Annovar and the Samtools suite as back-ends for variant annotation and depth computation, respectively. Please refer to the next section to install them.

0.2 Installation of other required software

Annovar is a widely used variant annotation tool, freely available upon registration on its website at https://annovar.openbioinformatics.org/en/latest/. The package contains Perl scripts and variant annotation reference databases for the human species. For other databases, please refer to the website. If the scripts are installed in a directory on the executable search path, LACE 2.0 will detect them automatically.

Perl (https://www.perl.org/) is required to run Annovar.

The Samtools suite is a standard set of tools and libraries for handling the SAM/BAM/BED file formats and performing a variety of common operations on sequencing data. It is freely available at http://www.htslib.org/ and https://github.com/samtools/htslib. To install Samtools, follow the instructions on the website.

0.2.0.1 For Windows users, we suggest the following guidelines:

  • Download MSYS2 or WSL

  • Download the Samtools source files from http://www.htslib.org/

  • The field db_home, in the Samtools source file etc/nsswitch.conf, should be changed to windows such that: db_home: windows

  • Install MSYS2/WSL (preferably with MSYS2 installed on the “C:” drive), and install the packages required by Samtools as stated in the INSTALL documentation file within the Samtools source folder

  • Inside an MSYS2/WSL shell, add the following directories to the PATH variable using the command: export PATH="/mingw64/bin/:/mingw64/:$PATH"

  • From the above MSYS2/WSL shell, follow the Samtools documentation to build and install the software

  • Change the Windows PATH variable in the System variables and add the following paths: C:\msys64\usr\bin, C:\msys64\usr, C:\msys64\mingw64\bin, C:\msys64\mingw64

  • Note that Annovar consists of Perl scripts, so .pl files need to be associated with the Perl executable.
    Once everything is set up, users should be able to start Samtools from the Windows command prompt using the command samtools, and to execute Perl scripts directly by calling their filenames.
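
Once the tools are installed (on Windows or on any other platform), one way to confirm that they are visible on the executable search path is to query it from R. The sketch below relies only on base R's Sys.which. The helper check_tools is hypothetical and not part of LACE, and the choice of table_annovar.pl as the Annovar script to look for is an assumption.

```r
# Hypothetical helper (not part of LACE): report which external tools
# are visible on the executable search path of the current R session.
check_tools <- function(tools) {
  paths <- Sys.which(tools)            # "" for tools that are not found
  found <- nzchar(unname(paths))
  stats::setNames(found, tools)
}

# 'table_annovar.pl' is one of the Perl scripts shipped with Annovar;
# which scripts LACE actually calls is an assumption here.
status <- check_tools(c("perl", "samtools", "table_annovar.pl"))
print(status)

missing_tools <- names(status)[!status]
if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
}
```

If a tool is reported as missing, revisit the PATH settings described above and restart the R session so that the updated PATH is picked up.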

0.3 Running LACE 2.0

To start the LACE 2.0 user interface, run:

library(LACE)
LACEview()

0.4 Using LACE 2.0

LACE 2.0 is designed for single-cell sequencing data for which variant calls in standard VCF format and aligned reads in standard BAM format are available.

The user is provided with an interface to initiate a project and to set filter thresholds, after which variants are annotated, data are filtered and the depth at variant sites is retrieved. Both annotation and depth computation are computationally expensive steps. LACE 2.0 avoids unnecessary re-computation by detecting changes to the parameters set in the user interface and by comparing the timestamps of the interface state, inputs and outputs. Intermediate and final data are stored in the designated folders.
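
The timestamp comparison can be pictured with a small, generic example. The sketch below is not LACE's internal code: the helper needs_recompute and the temporary files are hypothetical, and it only illustrates the idea of re-running a step when an output is missing or older than one of its inputs.

```r
# Hypothetical helper (not LACE's internal code): decide whether a step must
# be re-run by comparing modification times of its input and output files.
needs_recompute <- function(inputs, outputs) {
  stopifnot(all(file.exists(inputs)))
  # A missing output always forces a re-run.
  if (!all(file.exists(outputs))) {
    return(TRUE)
  }
  # Re-run if any input is newer than the oldest output.
  max(file.mtime(inputs)) > min(file.mtime(outputs))
}

# Temporary files standing in for a VCF input and a cached result.
input_file  <- tempfile(fileext = ".vcf")
output_file <- tempfile(fileext = ".rds")
writeLines("placeholder input", input_file)

# The output does not exist yet, so the step has to run.
print(needs_recompute(input_file, output_file))

# After the result is written and the input is older, nothing needs to run.
saveRDS(list(result = "placeholder"), output_file)
Sys.setFileTime(input_file, Sys.time() - 60)
print(needs_recompute(input_file, output_file))
```

A scheme like this only looks at files; detecting changed interface parameters, as described above, would additionally require storing and comparing the parameter values themselves.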

Next, the user can select the variants that are drivers of the biological problem under study.

At this point, the inferential step is executed: the maximum likelihood longitudinal clonal tree and the clonal prevalences are retrieved, together with the best set of false positive/negative rates among those provided by the user.
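
To illustrate what choosing the best set of false positive/negative rates can mean, the sketch below scores a fixed candidate genotype matrix against an observed binary mutation matrix under a simple independent-error model, and picks the candidate rate pair with the highest log-likelihood. This is a simplified illustration and not LACE's implementation: the function log_likelihood, the toy matrices and the candidate rates are all assumptions, and the actual inference also searches over trees and cell attachments and handles multiple time points.

```r
# Log-likelihood of observed data D (cells x mutations, entries 0/1/NA)
# given assumed true genotypes G, false positive rate alpha and
# false negative rate beta. Missing observations are ignored.
log_likelihood <- function(D, G, alpha, beta) {
  stopifnot(identical(dim(D), dim(G)))
  obs <- !is.na(D)
  d <- D[obs]
  g <- G[obs]
  n_tp <- sum(d == 1 & g == 1)   # mutation present and observed
  n_fn <- sum(d == 0 & g == 1)   # mutation present but missed
  n_fp <- sum(d == 1 & g == 0)   # mutation absent but observed
  n_tn <- sum(d == 0 & g == 0)   # mutation absent and not observed
  n_tp * log(1 - beta) + n_fn * log(beta) +
    n_fp * log(alpha) + n_tn * log(1 - alpha)
}

# Toy genotypes for 4 cells and 3 mutations (illustrative values only).
G <- matrix(c(1, 0, 0,
              1, 1, 0,
              1, 1, 1,
              1, 0, 0), nrow = 4, byrow = TRUE)
D <- G
D[3, 3] <- 0   # simulate one false negative in the observed data

# Candidate (alpha, beta) pairs, as a user might provide them.
candidates <- data.frame(alpha = c(0.01, 0.20), beta = c(0.10, 0.40))
candidates$loglik <- mapply(
  function(a, b) log_likelihood(D, G, a, b),
  candidates$alpha, candidates$beta
)

best <- candidates[which.max(candidates$loglik), ]
print(best)
```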

The results are displayed via an interactive interface.

0.5 Interface

The interface is divided into two parts, the processing interface and the results interface, with the variant selection and inferential computation steps taking place between them.

0.5.1 Project creation

To begin the clonal analysis, the user needs to create a project by choosing a folder and a meaningful name for the project (fields 1.1 and 1.2).