odseq {odseq}    R Documentation

Outlier detection in a multiple sequence alignment

Description

This function first computes a distance between every pair of sequences in the multiple alignment. It then bootstraps an average score of these distances to estimate the distribution of scores, which is used to identify outlier sequences at a given threshold.
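The procedure described above can be sketched in base R on a toy distance matrix. This is only an illustration under stated assumptions, not the package's internal code: the matrix d, the per-sequence score (mean distance to the other sequences), and the cutoff rule (bootstrap mean plus a normal quantile times the bootstrap standard deviation) are all choices made here for the example.

```r
# Toy symmetric distance matrix for 6 hypothetical sequences.
# seq6 is deliberately placed far away from all the others.
set.seed(42)
n <- 6
ids <- paste0("seq", 1:n)
d <- matrix(1, n, n, dimnames = list(ids, ids))
d[n, ] <- 10
d[, n] <- 10
diag(d) <- 0

# Score of each sequence: its average distance to the other sequences.
score <- rowSums(d) / (n - 1)

# Bootstrap the scores: resample with replacement B times and record
# the mean and standard deviation of each replicate.
B <- 1000
threshold <- 0.025
boot <- replicate(B, {
  resampled <- sample(score, replace = TRUE)
  c(mean = mean(resampled), sd = sd(resampled))
})

# Assumed cutoff rule: leave a probability of `threshold` in the right tail
# of a normal approximation built from the bootstrap estimates.
cutoff <- mean(boot["mean", ]) + qnorm(1 - threshold) * mean(boot["sd", ])

# Logical vector, TRUE for sequences whose score exceeds the cutoff.
outlier <- score > cutoff
print(outlier)
```

Lowering threshold pushes the cutoff further into the right tail, so fewer sequences are flagged; raising it has the opposite effect.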

Usage

odseq(msa_object, distance_metric = "linear", B = 100, threshold = 0.025)

Arguments

msa_object

An object of formal class MsaAAMultipleAlignment, as provided by the msa package.

distance_metric

A string indicating the type of distance metric to be computed. Only 'linear' and 'affine' are supported at the moment.

B

Integer indicating the number of bootstrap replicates to be run. Higher values should make the detection more robust, at the cost of a longer run time.

threshold

Numeric value giving the probability left in the right tail of the bootstrap score distribution when computing outliers. This parameter may need some tuning depending on the specific problem.
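This page does not define the two distance metrics. As a rough intuition only, the sketch below assumes they follow the usual linear versus affine gap-cost convention, applied to the positions where exactly one of two aligned sequences has a gap. The helper names (gap_mismatch, linear_distance, affine_distance) and the open and extend weights are hypothetical and are not part of the odseq package.

```r
# Two toy aligned sequences of equal length; "-" marks a gap.
a <- "AC--GTTA"
b <- "ACGGG--A"

# TRUE at columns where exactly one of the two sequences has a gap.
gap_mismatch <- function(x, y) {
  gx <- strsplit(x, "")[[1]] == "-"
  gy <- strsplit(y, "")[[1]] == "-"
  xor(gx, gy)
}

# Linear: every mismatching gap column costs the same.
linear_distance <- function(x, y) sum(gap_mismatch(x, y))

# Affine: opening a run of mismatching columns costs `open`,
# each further column in the same run costs `extend` (assumed weights).
affine_distance <- function(x, y, open = 3, extend = 1) {
  runs <- rle(gap_mismatch(x, y))
  run_lengths <- runs$lengths[runs$values]
  sum(open + extend * (run_lengths - 1))
}

print(linear_distance(a, b))
print(affine_distance(a, b))
```

Under these assumptions the affine variant penalises many short gap disagreements more heavily than one long disagreement of the same total length.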

Value

Returns a logical vector, where TRUE indicates an outlier.
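A logical vector of this kind can be used directly for counting and subsetting. The sketch below uses a hand-made vector standing in for the function's output; the sequence names, and the assumption that the result is named by sequence, are illustrative only.

```r
# Hypothetical result standing in for the output of odseq().
is_outlier <- c(seqA = FALSE, seqB = TRUE, seqC = FALSE, seqD = TRUE)

# Names of the flagged sequences (assumes the vector carries names).
flagged <- names(is_outlier)[is_outlier]

# Number of outliers: TRUE counts as 1 when summed.
n_flagged <- sum(is_outlier)

# Names of the sequences that were not flagged.
kept <- names(is_outlier)[!is_outlier]

print(flagged)
print(n_flagged)
print(kept)
```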

Author(s)

José Jiménez <jose@jimenezluna.com>

References

[1] OD-seq: outlier detection in multiple sequence alignments. Peter Jehl, Fabian Sievers and Desmond G. Higgins. BMC Bioinformatics. 2015.

See Also

odseq_unaligned

Examples

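library(msa)
library(odseq)
# Load the example sequences and align them
data(seqs)
al <- msa(seqs)
# Flag outliers with the affine metric and 1000 bootstrap replicates
odseq(al, distance_metric = "affine", B = 1000, threshold = 0.025)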

[Package odseq version 1.20.0 Index]