combinatorialDist {motifcounter} | R Documentation |
This function approximates the distribution of the number of motif hits. To this end, it sums over all combinations of obtaining k hits in a random sequence of a given length using an efficient dynamic programming algorithm.
combinatorialDist(seqlen, overlap)
seqlen | Integer-valued vector that defines the lengths of the individual sequences. For a given DNAStringSet, this information can be retrieved using
overlap | An Overlap object.
This function is an alternative to compoundPoissonDist.
It requires fixed-length sequences and currently
only supports computing the distribution of the
number of hits when both DNA strands are scanned
for motif hits.
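The dynamic programming idea mentioned in the description can be illustrated with a deliberately simplified toy model. The sketch below is not the algorithm implemented in motifcounter: it assumes a single strand, a constant per-position hit probability `alpha`, and no overlapping hits at all, whereas the package uses the overlap probabilities stored in the Overlap object. The function name `toy_hit_distribution` and its parameters are hypothetical and exist only for this illustration.

```r
# Toy model (assumption, not the package's method): each admissible start
# position produces a hit with probability alpha, and a hit blocks the next
# motiflen - 1 start positions, so hits never overlap.
toy_hit_distribution <- function(seqlen, motiflen, alpha) {
  npos <- seqlen - motiflen + 1      # number of possible start positions
  stopifnot(npos >= 1, alpha >= 0, alpha <= 1)
  maxhits <- seqlen %/% motiflen     # most non-overlapping hits that fit

  # state[j, k + 1]: probability that j is the next free start position
  # and exactly k hits have been placed so far.
  state <- matrix(0, nrow = npos + 1, ncol = maxhits + 1)
  state[1, 1] <- 1

  for (j in seq_len(npos)) {
    for (k in 0:maxhits) {
      p <- state[j, k + 1]
      if (p == 0) next
      # No hit at position j: move on to the next start position.
      state[j + 1, k + 1] <- state[j + 1, k + 1] + p * (1 - alpha)
      # Hit at position j: skip the blocked positions.
      if (k < maxhits) {
        jn <- min(j + motiflen, npos + 1)
        state[jn, k + 2] <- state[jn, k + 2] + p * alpha
      }
    }
  }
  # The last row collects all paths that have passed the final start position.
  state[npos + 1, ]
}

# Distribution of 0, 1, 2, ... hits in a 100 bp sequence for a 10 bp motif
toy_dist <- toy_hit_distribution(seqlen = 100, motiflen = 10, alpha = 0.01)
```

Each table entry aggregates every placement of k hits that leads to the same state, so the sum over all combinations is obtained without enumerating them. With `motiflen = 1` the toy model reduces to a binomial distribution, which makes for a convenient sanity check.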
List containing the distribution of the number of hits.
# Load sequences
seqfile = system.file("extdata", "seq.fasta", package = "motifcounter")
seqs = Biostrings::readDNAStringSet(seqfile)

# Load motif
motiffile = system.file("extdata", "x31.tab", package = "motifcounter")
motif = t(as.matrix(read.table(motiffile)))

# Load background model
bg = readBackground(seqs, 1)

# Compute overlap probabilities
op = motifcounter:::probOverlapHit(motif, bg, singlestranded = FALSE)

# Use 2 sequences of length 100 bp each
seqlen = rep(100, 2)

# Computes the combinatorial distribution of the number of motif hits
dist = motifcounter:::combinatorialDist(seqlen, op)