
library(TreeAndLeaf)
library(RedeR)
#> ***This is RedeR 1.36.0! For a quick start, please type 'vignette('RedeR')'.
#>    Supporting information is available at Genome Biology 13:R29, 2012,
#>    (doi:10.1186/gb-2012-13-4-r29).
library(RColorBrewer)
library(igraph)
#> 
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:stats':
#> 
#>     decompose, spectrum
#> The following object is masked from 'package:base':
#> 
#>     union
library(ape)
#> 
#> Attaching package: 'ape'
#> The following objects are masked from 'package:igraph':
#> 
#>     edges, mst, ring

1 Overview

TreeAndLeaf is an R-based package for better visualization of dendrograms and phylogenetic trees. The package changes the way a dendrogram is viewed. Through the use of the igraph format and the package RedeR, the nodes are rearranged and the hierarchical relations are kept intact, resulting in an image that is easier to read and can be enhanced with additional layers of information.

The classical dendrogram is limited in two ways. First, it displays only one type of information: the hierarchical relation between the observations. Second, it is limited by its size: the larger the dataset, the less readable it becomes. TreeAndLeaf improves the use of space because the tree spreads in all directions, allowing for an improved visualization and a better image for publications. The package RedeR, used for plotting in this package, uses a force-based relaxation algorithm that helps nodes avoid overlaps. By building on RedeR and the igraph format, the package allows the dendrogram to be customized with multiple layers of information, represented by edge widths and colors, node colors, node sizes, line colors, etc. The package also includes a fast formatting option for quick, exploratory analysis. The package is therefore designed to make plotting dendrograms more useful, less confusing and more productive. The workflow for using this package is depicted in Figure 1.

Figure 1. A brief representation of what TreeAndLeaf functions are capable of. (A,B) The dendrogram in A was used to construct the graph representation shown in B. (C) Workflow summary. The main input data consists of a distance matrix, which is used to generate a dendrogram. The TreeAndLeaf package transforms the dendrogram into a graph representation.

This document guides you through the basics and gives you ideas of how to use the functions to their full potential. Although TreeAndLeaf was created for systems biology applications, it is not at all limited to this use.

2 USArrests - a small dendrogram example

This section provides a quick and basic example using the R built-in data frame USArrests, shown below. To learn more about the information in this data frame, use ?USArrests. To use TreeAndLeaf functions to their full potential, it is recommended that your data frame has rownames set before making the dendrogram, as this one has.

dim(USArrests)
#> [1] 50  4
head(USArrests)
#>            Murder Assault UrbanPop Rape
#> Alabama      13.2     236       58 21.2
#> Alaska       10.0     263       48 44.5
#> Arizona       8.1     294       80 31.0
#> Arkansas      8.8     190       50 19.5
#> California    9.0     276       91 40.6
#> Colorado      7.9     204       78 38.7
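If your own data keeps its labels in a regular column rather than in the rownames, the following minimal sketch shows one way to move them before clustering. The data frame `df` and its columns are hypothetical and serve only as an illustration.

```r
# Hypothetical data frame whose labels live in a regular column
df <- data.frame(
  state = c("A", "B", "C"),
  x = c(1.0, 2.5, 4.0),
  y = c(3.0, 1.5, 0.5)
)

# Move the labels into the rownames and drop the original column,
# so that dist() only sees numeric variables
rownames(df) <- df$state
df$state <- NULL

# The rownames are carried through dist() and become the hclust labels
hc_small <- hclust(dist(df))
hc_small$labels
```

With the rownames in place, the labels travel with the dendrogram and can be matched back to the data later on.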

2.1 Building a dendrogram using R hclust()

To build a dendrogram, you need a distance matrix of the observations. For example, dist() with its default “euclidean” method generates one, and hclust() with the “average” method then creates the dendrogram.

hc <- hclust(dist(USArrests), method = "average")
plot(hc)
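Before converting the dendrogram, it can help to look at what the hclust object holds. The sketch below uses only base R; the cophenetic correlation is an optional sanity check on how well the tree preserves the original distances, not a step required by TreeAndLeaf.

```r
d  <- dist(USArrests)
hc <- hclust(d, method = "average")

# One label per observation, taken from the rownames of the data frame
n_labels <- length(hc$labels)

# One row per merge step; a tree with n leaves has n - 1 merges
merge_dim <- dim(hc$merge)

# Correlation between the original distances and the distances
# implied by the dendrogram (cophenetic distances)
coph <- cor(d, cophenetic(hc))
```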

2.2 Converting your hclust object to an igraph object

This is a rather simple but important step. Since TreeAndLeaf and RedeR work with igraph objects, a function is provided to convert an hclust dendrogram into an igraph object. For that, simply use hclust2igraph().

gg <- hclust2igraph(hc)
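To give an idea of what such a conversion involves, here is a minimal base R sketch that turns the merge matrix of an hclust object into a parent-child edge list. It is an illustration only, not the implementation used by hclust2igraph(); the helper merge_to_edges() and the "N1", "N2", ... names for internal nodes are hypothetical.

```r
hc <- hclust(dist(USArrests), method = "average")

# Hypothetical helper: build a parent-child edge list from hc$merge.
# In hc$merge, a negative entry -j refers to observation j (a leaf),
# and a positive entry j refers to the cluster formed at merge step j.
merge_to_edges <- function(hc) {
  child_name <- function(j) {
    if (j < 0) hc$labels[-j] else paste0("N", j)
  }
  n_merges <- nrow(hc$merge)
  data.frame(
    parent = rep(paste0("N", seq_len(n_merges)), each = 2),
    child  = vapply(as.vector(t(hc$merge)), child_name, character(1)),
    stringsAsFactors = FALSE
  )
}

edges   <- merge_to_edges(hc)
n_edges <- nrow(edges)                                     # two edges per merge
n_nodes <- length(unique(c(edges$parent, edges$child)))    # leaves + internal nodes
```

Each merge step becomes an internal (non-leaf) node with two children, which is why the resulting graph contains more nodes than the data has observations.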

2.3 Formatting the igraph for better visualization in RedeR

The TreeAndLeaf package offers a quick formatting option through the function formatTree(), a theme function used to standardize node sizes and colors. This is an important step because the tree has leaf nodes (the ones representing your observations) and non-leaf nodes (the ones representing bifurcations of the dendrogram), and this function makes the latter invisible to achieve the desired appearance and proper relaxation. A description of the available themes can be found at ?formatTree.

gg <- formatTree(gg = gg, theme = 5)

Now, the tree-and-leaf diagram is ready to be shown in RedeR with treeAndLeaf(), or you can have layers of information added to it, as shown below.

2.4 Inserting additional layers of information

RedeR offers a set of functions to manipulate igraph attributes according to the parameters the application reads.

First, att.mapv() is used to insert the dataframe inside the igraph object and make it available for setting node attributes. In this step, it is crucial that the refcol points to a column with the same content as hc$labels.

In this case, refcol = 0 indicates the rownames of the dataframe.

gg <- att.mapv(g = gg, dat = USArrests, refcol = 0)
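A quick way to confirm that the reference column matches the dendrogram labels is to compare the two directly before mapping. This is a minimal sketch using base R only; the variable missing_labels is introduced here for illustration.

```r
hc <- hclust(dist(USArrests), method = "average")

# Labels present in the dendrogram but absent from the data frame rownames
missing_labels <- setdiff(hc$labels, rownames(USArrests))

# Stop early if any label cannot be matched
stopifnot(length(missing_labels) == 0)
```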

Now that the info is available, att.setv() changes the igraph attributes. The package RColorBrewer can be used to generate a palette for reference. Try ?addGraph to see the options of igraph attributes RedeR can read.

pal <- brewer.pal(9, "Reds")
gg <- att.setv(g = gg, from = "Murder", to = "nodeColor",
               cols = pal, nquant = 5)
gg <- att.setv(g = gg, from = "UrbanPop", to = "nodeSize",
               xlim = c(50, 150, 1), nquant = 5)
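To build some intuition for what mapping a numeric column onto a small set of colors means, the sketch below bins the Murder column into five quantile-based groups and assigns one color per group. It assumes quantile-based binning and is not RedeR's internal code; the two endpoint colors and the variable names are chosen here for illustration.

```r
murder <- USArrests$Murder

# Break points at the 0%, 20%, ..., 100% quantiles (five bins);
# unique() guards against duplicated break points
breaks <- unique(quantile(murder, probs = seq(0, 1, length.out = 6)))

# Bin index (1 = lowest values) for each observation
bins <- cut(murder, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# One color per bin, interpolated from a light to a dark red
bin_cols  <- colorRampPalette(c("#FFF5F0", "#67000D"))(length(breaks) - 1)
node_cols <- bin_cols[bins]
names(node_cols) <- rownames(USArrests)

head(node_cols)
```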

2.5 Calling the RedeR interface

With the igraph ready to be visualized, you need to invoke the RedeR interface. This might take a few seconds.

rdp <- RedPort()
calld(rdp)
resetd(rdp)

2.6 Calling treeAndLeaf() and adding legends

This is TreeAndLeaf’s main function. It reads your igraph object, generates the tree layout, plots it in the RedeR interface and applies functions that improve the appearance and distribution of the nodes.

treeAndLeaf(obj = rdp,
            gg = gg)

Adding legends is optional. When you call att.setv() and provide column names for nodeColor and nodeSize, it automatically generates a RedeR-readable legend, which can be plotted using the code below.

addLegend.color(obj = rdp,
                gg,
                title = "Murder Rate",
                position = "right")

addLegend.size(obj = rdp,
               gg,
               title = "Urban Population Size",
               position = "bottomright")

2.7 Making manual adjustments

At this stage, the image produced needs small adjustments to resolve the residual edge crossings. You can simply click and drag a node to adjust it while the relaxation algorithm is still running.