treeAndLeaf() and adding legends
TreeAndLeaf is an R-based package for better visualization of dendrograms and phylogenetic trees. The package changes the way a dendrogram is viewed. Through the use of the igraph format and the package RedeR, the nodes are rearranged and the hierarchical relations are kept intact, resulting in an image that is easier to read and can be enhanced with additional layers of information.
The classical dendrogram is a limited format in two ways. First, it displays only one type of information: the hierarchical relation between the observations. Second, it is limited by its size: the larger the dataset, the less readable it becomes. TreeAndLeaf improves the use of space by laying the tree out in all directions, which makes for a clearer visualization and better images for publication. The package RedeR, which TreeAndLeaf uses for plotting, applies a force-based relaxation algorithm that helps nodes avoid overlaps. By building on RedeR and the igraph format, the package lets you customize the dendrogram with multiple layers of information, represented by edge widths and colors, node colors, node sizes, line colors, and so on. The package also includes a fast formatting option for quick, exploratory analysis. The package is therefore designed to make plotting dendrograms more useful, less confusing, and more productive. The workflow for this package is depicted in Figure 1.
Figure 1. A brief representation of what TreeAndLeaf functions are capable of. (A,B) The dendrogram in A was used to construct the graph representation shown in B. (C) Workflow summary. The main input data consists of a distance matrix, which is used to generate a dendrogram. The TreeAndLeaf package transforms the dendrogram into a graph representation.
This document is intended to guide you through the basics and give you ideas on how to use the functions to their full potential. Although TreeAndLeaf was created for systems biology applications, it is not at all limited to this use.
This section provides a quick and basic example using the R built-in dataframe USArrests.
First, the packages necessary to the analysis are loaded.
library(TreeAndLeaf)
library(RedeR)
#> ***This is RedeR 1.38.0! For a quick start, please type 'vignette('RedeR')'.
#> Supporting information is available at Genome Biology 13:R29, 2012,
#> (doi:10.1186/gb-2012-13-4-r29).
library(RColorBrewer)
As stated above, USArrests is a dataframe readily available in R.
To learn more about the information in this dataframe, see its help page with ?USArrests.
To use TreeAndLeaf functions to their full potential, it is recommended that
your dataframe has row names set before you build the dendrogram, as this one does.
dim(USArrests)
#> [1] 50  4
head(USArrests)
#>            Murder Assault UrbanPop Rape
#> Alabama      13.2     236       58 21.2
#> Alaska       10.0     263       48 44.5
#> Arizona       8.1     294       80 31.0
#> Arkansas      8.8     190       50 19.5
#> California    9.0     276       91 40.6
#> Colorado      7.9     204       78 38.7
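The row-name recommendation above matters because dist() and hclust() carry row names through as the labels of the dendrogram leaves. As a minimal sketch, assuming a hypothetical dataframe named df whose identifiers sit in an ordinary column named id (both names are invented for illustration and are not part of the package), the row names could be set like this before clustering:

```r
# Hypothetical toy data: identifiers are stored in a regular column
df <- data.frame(
  id = c("a", "b", "c"),
  x  = c(1.0, 2.5, 4.0),
  y  = c(0.5, 1.5, 3.0)
)

# Move the identifiers into the row names and drop the column,
# so that only numeric variables remain for the distance computation
rownames(df) <- df$id
df$id <- NULL

# The row names are carried through dist() and become the leaf labels
hc_toy <- hclust(dist(df), "average")
hc_toy$labels
```

Row names must be unique, so an identifier column with repeated values would need to be made unique first.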
To build a dendrogram, you need a distance matrix of the
observations. For example,
dist() with its default “euclidean” method can be used to generate the distance matrix, and
hclust() with the “average” method can then be used to create the dendrogram.
hc <- hclust(dist(USArrests), "ave")
plot(hc)
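The one-line call above can also be written step by step, which makes the chosen methods explicit and lets you check the result before plotting. This is only a sketch using base R; the variable names d and hc_check are illustrative. Note that hclust() matches the abbreviation "ave" to the full method name "average".

```r
# Step 1: Euclidean distance matrix between the 50 states (rows of USArrests)
d <- dist(USArrests, method = "euclidean")

# Step 2: average-linkage hierarchical clustering on that distance matrix
hc_check <- hclust(d, method = "average")

# The leaf labels are taken from the row names of the input dataframe
all(hc_check$labels == rownames(USArrests))

# The clustering and distance methods are stored in the result
hc_check$method
hc_check$dist.method

# A tree with n leaves has n - 1 merge steps
nrow(hc_check$merge) == nrow(USArrests) - 1
```

Spelling out the method arguments is optional, but it documents the choices in the script and makes it easier to swap in another distance or linkage method later.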