findSites {sitePath}    R Documentation

Finding sites with variation

Description

Throughout this package, single nucleotide polymorphism (SNP) refers to amino acid variation. SNPsites tries to find SNPs in the multiple sequence alignment. A reference sequence and a gap character may be specified for site numbering. This step is not required for the intended analysis but may be helpful for evaluating the performance of fixationSites.

After finding the lineagePath of a phylogenetic tree, fixationSites uses the result to find sites that show fixation on some, if not all, of the lineages. Parallel evolution is relatively common in RNA viruses. There is a chance that a site is fixed in one lineage but does not show fixation in another because of a different sequence context.

After finding the lineagePath of a phylogenetic tree, multiFixationSites uses the result to find sites that show multiple fixations on some, if not all, of the lineages.

Usage

SNPsites(tree, reference = NULL, gapChar = "-", minSNP = NULL)

## S3 method for class 'lineagePath'
fixationSites(paths, reference = NULL,
  gapChar = "-", tolerance = 0.01, minEffectiveSize = NULL,
  extendedSearch = TRUE, ...)

## S3 method for class 'lineagePath'
multiFixationSites(paths, reference = NULL,
  gapChar = "-", minEffectiveSize = NULL, extendedSearch = TRUE, ...)

Arguments

tree

The return value of the addMSA function.

reference

Name of the reference sequence used for site numbering. It must be the name of one of the sequences. By default, the intrinsic alignment numbering is used.

gapChar

The character that indicates a gap. Site numbering skips positions where the reference sequence has gapChar.

minSNP

Minimum number of amino acid variations for a site to be considered a SNP.

paths

A lineagePath object returned by the lineagePath function.

tolerance

A numeric vector of one or two values specifying the maximum amino acid variation allowed before/after a mutation. A mutation exceeding this limit is not included in the return. If two values are given, the first applies to the ancestral group and the second to the descendant group. If only one value is given, it is the maximum for both. The default is 0.01.

minEffectiveSize

A vector of one or two integers specifying the minimum number of tree tips involved before/after a mutation. A mutation with fewer tips is not included in the return. If two values are given, the first applies to the ancestral group and the second to the descendant group. If only one value is given, it is the minimum for both.

extendedSearch

Whether to extend the search. The terminal of each lineagePath is a cluster of tips. To look for fixation mutations within that cluster, the common ancestral node of the farthest tips (at least two) becomes the new terminal search point.

...

Further arguments passed to or from other methods.
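
The effect of reference and gapChar on site numbering can be illustrated with a small base R sketch. This is not the package's implementation; the aligned sequence refSeq and the variable names are assumptions made up for illustration. It only shows the idea that gap positions in the reference sequence are skipped when numbering alignment columns.

```r
# Hypothetical aligned reference sequence; "-" marks alignment gaps.
refSeq <- "MK-TA--LV"
gapChar <- "-"

# Split the sequence into single characters and flag the gap columns.
residues <- strsplit(refSeq, "")[[1]]
isGap <- residues == gapChar

# Count non-gap residues up to each alignment column; gap columns get NA.
refNumber <- cumsum(!isGap)
refNumber[isGap] <- NA_integer_

# Map each alignment column to its reference-based site number.
numbering <- data.frame(
  column = seq_along(residues),
  residue = residues,
  refNumber = refNumber
)
print(numbering)
```

In this sketch, alignment column 4 (residue T) maps to reference site 3 because the gap at column 3 is not counted.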

Value

SNPsites returns a list of qualified SNP sites.

fixationSites returns a list of mutations with names of the tips involved. The name of each list element is the discovered mutation. A mutation has two vectors of tip names: 'from' before the fixation and 'to' after the fixation.

multiFixationSites returns sites with multiple fixations.
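
As a rough sketch of how a result shaped like the fixationSites return described above might be inspected, the base R snippet below builds a mock list by hand. The mutation name A10T and the tip names are invented for illustration, and the real object may carry a class or additional attributes not shown here.

```r
# Hypothetical stand-in for a fixationSites() result: the mutation name and
# tip names below are invented purely for illustration.
res <- list(
  A10T = list(
    from = c("tip1", "tip2"),
    to = c("tip3", "tip4", "tip5")
  )
)

# Count the tips before ('from') and after ('to') each fixation.
tipCounts <- vapply(
  res,
  function(m) c(from = length(m$from), to = length(m$to)),
  FUN.VALUE = c(from = 0L, to = 0L)
)
print(tipCounts)
```

Each column of tipCounts corresponds to one mutation, which makes it easy to compare group sizes against a chosen minEffectiveSize.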

Examples

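library(sitePath)

# Zika virus data: attach the alignment to the tree
data("zikv_tree_reduced")
data("zikv_align_reduced")
tree <- addMSA(zikv_tree_reduced, alignment = zikv_align_reduced)

# Find SNP sites in the alignment
SNPsites(tree)

# Find fixation sites along the lineage paths
fixationSites(
    lineagePath(tree),
    tolerance = c(1, 1),
    minEffectiveSize = c(50, 50)
)

# H3N2 data: find sites with multiple fixations
data("h3n2_tree_reduced")
data("h3n2_align_reduced")
tree <- addMSA(h3n2_tree_reduced, alignment = h3n2_align_reduced)
multiFixationSites(lineagePath(tree))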

[Package sitePath version 1.0.3 Index]