shuffle_sequences {universalmotif} | R Documentation |
Given a set of input sequences, shuffle the letters within those sequences with any k-let size.
shuffle_sequences(sequences, k = 1, method = "linear", leftovers = "asis", progress = FALSE, BP = FALSE)
sequences |
|
k |
|
method |
|
leftovers |
|
progress |
|
BP |
|
If method = 'markov', then a Markov model is used to generate sequences
which will maintain (on average) the k-let frequencies. Please note that
this method is not a 'true' shuffling, and for short sequences
(e.g. <100 bp) it can produce sequences slightly more dissimilar to the
input than true shuffling would. See Fitch (1983) and
Altschul and Erickson (1985) for a discussion of the topic.
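To make the idea of keeping k-let frequencies "on average" concrete, the following is a minimal base R sketch for the k = 2 case. It is not the universalmotif implementation: the helper name markov_sketch is hypothetical, and the choice of a first-order model (each letter drawn given the previous one) fitted to a single input string is an assumption made for illustration.

```r
# Hypothetical helper, not part of universalmotif: generate a sequence of the
# same length from a first-order Markov model fitted to the input string.
markov_sketch <- function(x) {
  letters_in <- strsplit(x, "")[[1]]
  n <- length(letters_in)
  if (n < 2) return(x)
  alphabet <- sort(unique(letters_in))
  # Transition counts: rows are the current letter, columns the next letter.
  trans <- table(factor(letters_in[-n], levels = alphabet),
                 factor(letters_in[-1], levels = alphabet))
  out <- character(n)
  out[1] <- sample(letters_in, 1)  # start letter drawn from letter frequencies
  for (i in 2:n) {
    probs <- as.numeric(trans[out[i - 1], ])
    # A letter seen only at the very end has no outgoing transitions;
    # fall back to overall letter frequencies in that case (assumption).
    if (sum(probs) == 0) {
      probs <- as.numeric(table(factor(letters_in, levels = alphabet)))
    }
    out[i] <- sample(alphabet, 1, prob = probs)
  }
  paste(out, collapse = "")
}

set.seed(1)
markov_sketch("ACAGATAGACCCGGTTACGATCGA")
```

Because every letter is sampled rather than permuted, the output's letter and 2-let counts generally differ from the input's, which is why this is not a 'true' shuffle.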
If method = 'linear', then the input sequences are split linearly every
k letters; for example, for k = 3, 'ACAGATAGACCC' becomes
'ACA GAT AGA CCC', after which these 3-lets are shuffled randomly. If
method = 'random', then k-lets are picked from the sequence completely
randomly. This, however, can leave 'leftover' letters: lone letter
islands smaller than k. There are a few options for dealing with these:
leftovers = 'asis' will leave these letter islands in place;
leftovers = 'first' will place these letters at the beginning of the
sequence; leftovers = 'split' will place half of the leftovers at the
beginning and half at the end of the sequence; leftovers = 'discard'
simply gets rid of the leftovers.
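The linear split and three of the leftover options described above can be sketched in base R as follows. Both helpers (shuffle_linear_sketch and place_leftovers_sketch) are hypothetical names and not part of universalmotif. The sketch assumes that leftover letters have already been collected into one string, that trailing letters in the linear case stay in place, and that with an odd number of leftovers the extra letter goes to the end; 'asis' is omitted because it depends on where the islands sit in the sequence.

```r
# Hypothetical helper: linear k-let shuffle of a single string.
shuffle_linear_sketch <- function(x, k = 3) {
  n <- nchar(x)
  n_full <- n %/% k                       # number of complete k-lets
  if (n_full < 2) return(x)               # nothing to shuffle
  starts <- seq(1, by = k, length.out = n_full)
  klets <- substring(x, starts, starts + k - 1)
  # Letters past the last complete k-let are left where they are
  # (an assumption of this sketch).
  tail_part <- substring(x, n_full * k + 1, n)
  paste0(paste(sample(klets), collapse = ""), tail_part)
}

# Hypothetical helper: place leftover letters around the shuffled k-lets.
place_leftovers_sketch <- function(core, leftover,
                                   leftovers = c("first", "split", "discard")) {
  leftovers <- match.arg(leftovers)
  half <- nchar(leftover) %/% 2           # odd counts: extra letter goes last
  switch(leftovers,
    first   = paste0(leftover, core),
    split   = paste0(substring(leftover, 1, half), core,
                     substring(leftover, half + 1, nchar(leftover))),
    discard = core
  )
}

set.seed(1)
shuffle_linear_sketch("ACAGATAGACCC", k = 3)
place_leftovers_sketch("AGACCC", "GT", leftovers = "split")
```

In the linear case every complete 3-let of the input ('ACA', 'GAT', 'AGA', 'CCC') appears exactly once in the output; only their order changes.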
Do note, however, that the method parameter is only relevant for k > 1.
For k = 1, a simple sample call is performed.
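For the single-letter case, shuffling with sample amounts to permuting the letters of each sequence. A minimal base R sketch on a plain character string follows; the helper name shuffle_letters_sketch is hypothetical, and the package itself operates on XStringSet objects rather than plain strings.

```r
# Hypothetical helper: permute the individual letters of one string.
shuffle_letters_sketch <- function(x) {
  letters_in <- strsplit(x, "")[[1]]
  # sample() with no size argument returns a random permutation.
  paste(sample(letters_in), collapse = "")
}

set.seed(1)
shuffle_letters_sketch("ACAGATAGACCC")
```

The result has the same length and the same letter counts as the input, since a permutation only reorders letters.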
XStringSet. The input sequences will be returned with identical names
and lengths.
Benjamin Jean-Marie Tremblay, b2tremblay@uwaterloo.ca
Altschul SF, Erickson BW (1985). “Significance of Nucleotide Sequence Alignments: A Method for Random Sequence Permutation That Preserves Dinucleotide and Codon Usage.” Molecular Biology and Evolution, 2, 526-538.
Fitch WM (1983). “Random sequences.” Journal of Molecular Biology, 163, 171-176.
create_sequences(), scan_sequences(), enrich_motifs(), shuffle_motifs()
sequences <- create_sequences()
sequences.shuffled <- shuffle_sequences(sequences, k = 2)