create_sequences {universalmotif}    R Documentation

Create random sequences.

Description

Generate random sequences from any set of characters, represented as XStringSet objects.

Usage

create_sequences(alphabet = "DNA", seqnum = 100, seqlen = 100,
  monofreqs, difreqs, trifreqs, progress = FALSE, BP = FALSE)

Arguments

alphabet

character(1) One of c('DNA', 'RNA', 'AA'), or a string of characters to be used as the alphabet.

seqnum

numeric(1) Number of sequences to generate.

seqlen

numeric(1) Length of random sequences.

monofreqs

numeric Alphabet frequencies to use. If missing, uniform frequencies are assumed. Not used if difreqs or trifreqs are provided.
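To illustrate what per-letter frequencies mean here, the following base R sketch draws random strings from an alphabet with given letter probabilities, falling back to uniform probabilities when none are supplied. It is not the package's implementation: the helper random_seq() is hypothetical, and it returns a plain character vector rather than an XStringSet.

```r
# Hypothetical helper, for illustration only (not part of universalmotif).
# Draws one random string of length `seqlen` from `alphabet`; when `freqs`
# is not supplied, every letter is equally likely.
random_seq <- function(alphabet, seqlen,
                       freqs = rep(1 / length(alphabet), length(alphabet))) {
  letters_drawn <- sample(alphabet, seqlen, replace = TRUE, prob = freqs)
  paste(letters_drawn, collapse = "")
}

set.seed(1)  # arbitrary seed, only to make the sketch reproducible
alphabet <- c("A", "C", "G", "T")
monofreqs <- c(0.3, 0.2, 0.2, 0.3)  # one value per letter, in alphabet order

# 100 sequences of length 100, mirroring the seqnum and seqlen defaults
seqs <- vapply(seq_len(100),
               function(i) random_seq(alphabet, 100, monofreqs),
               character(1))
```
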

difreqs

numeric Dinucleotide frequencies. DNA/RNA only. Must be a named numeric vector of length 16.

trifreqs

numeric Trinucleotide frequencies. DNA/RNA only. Must be a named numeric vector of length 64.
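A named vector of the required length can be assembled in base R. The sketch below assumes the names are the 16 dinucleotides (or 64 trinucleotides) over A, C, G and T; the weights are made up for illustration, and the final call is left commented out because it needs the package.

```r
bases <- c("A", "C", "G", "T")

# expand.grid() varies its first column fastest, so listing the last
# position first yields names in the order AA, AC, AG, AT, CA, ...
grid2 <- expand.grid(second = bases, first = bases,
                     stringsAsFactors = FALSE)
dinucs <- paste0(grid2$first, grid2$second)

# Illustrative weights only: dinucleotides made of A and T count double.
weights <- ifelse(grepl("^[AT]{2}$", dinucs), 2, 1)
difreqs <- setNames(weights / sum(weights), dinucs)

# The same idea for 64 uniform trinucleotide frequencies.
grid3 <- expand.grid(third = bases, second = bases, first = bases,
                     stringsAsFactors = FALSE)
trinucs <- paste0(grid3$first, grid3$second, grid3$third)
trifreqs <- setNames(rep(1 / 64, 64), trinucs)

# Assumed usage, requires universalmotif to be loaded:
# sequences.di <- create_sequences(difreqs = difreqs)
```
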

progress

logical(1) Show progress. Not recommended if BP = TRUE.

BP

logical(1) Allows the use of BiocParallel within create_sequences(). See BiocParallel::register() to change the default backend. Setting BP = TRUE is only recommended for large jobs (such as create_sequences(seqlen=100000,seqnum=100000)). Furthermore, the behaviour of progress = TRUE is changed if BP = TRUE; the default BiocParallel progress bar will be shown (which unfortunately is much less informative).
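As a rough sketch of changing the default backend before a large job, the snippet below registers a two-worker SnowParam backend. The worker count is an arbitrary assumption, the code only runs if BiocParallel is installed, and the create_sequences() call is left commented out because it needs the package.

```r
# Only touch BiocParallel if it is actually installed.
has_bp <- requireNamespace("BiocParallel", quietly = TRUE)

if (has_bp) {
  # Register a socket-based backend with 2 workers as the new default.
  BiocParallel::register(BiocParallel::SnowParam(workers = 2))
  # Show which backend is now the registered default.
  print(BiocParallel::bpparam())
}

# Assumed usage for a large job, requires universalmotif to be loaded:
# sequences.big <- create_sequences(seqnum = 100000, seqlen = 100000,
#                                   BP = TRUE)
```
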

Value

XStringSet The returned sequences are unnamed.

Author(s)

Benjamin Jean-Marie Tremblay, b2tremblay@uwaterloo.ca

References

Pagès H, Aboyoun P, Gentleman R, DebRoy S (2018). Biostrings: Efficient manipulation of biological strings. R package version 2.48.0.

See Also

create_motif(), shuffle_sequences()

Examples

## create DNA sequences with slightly increased AT content:
sequences <- create_sequences(monofreqs = c(0.3, 0.2, 0.2, 0.3))
## create custom sequences:
sequences.QWER <- create_sequences("QWER")
## you can include non-alphabetic characters as well, even spaces:
sequences.custom <- create_sequences("!@#$ ")


[Package universalmotif version 1.0.22 Index]