LearnNonCoding {DECIPHER}R Documentation

Learn a Non-Coding RNA Model

Description

Learns a compact representation of patterns representing a set of non-coding RNAs belonging to the same family.

Usage

LearnNonCoding(myXStringSet,
               threshold = 0.3,
               weight = NA,
               maxLoopLength = 500,
               maxPatterns = 20,
               scoreDependence = FALSE,
               structure = NULL,
               processors = 1)

Arguments

myXStringSet

A DNAStringSet or RNAStringSet object of aligned sequence representatives belonging to the same non-coding RNA family.

threshold

Numeric specifying the minimum relative frequency of patterns to consider during learning.

weight

Either a numeric vector of weights for each sequence, a single number implying equal weights, or NA (the default) to automatically calculate sequence weights based on myXStringSet.

maxLoopLength

Numeric giving the maximum length of conserved hairpin loops to consider.

maxPatterns

A numeric vector of length two specifying the maximum number of motifs and hairpins, respectively, or a single numeric giving the maximum for each.

scoreDependence

Logical determining whether to record a log-odds score for dependencies between patterns. The default (FALSE) is recommended for most non-coding RNA families.

structure

Either a character string providing the consensus secondary structure in dot bracket notation, a matrix of paired positions in the first two columns, or NULL (the default) to predict the consensus secondary structure with PredictDBN.

processors

The number of processors to use, or NULL to automatically detect and use all available processors.

Details

Non-coding RNAs belonging to the same family typically have conserved sequence motifs, secondary structure elements, and k-mer frequencies that can be used to identify members of the family. LearnNonCoding identifies these conserved patterns and determines which are best for identifying the non-coding RNA relative to a random sequence background. Sequence motifs and hairpins are defined relative to their distance from the start or end of the non-coding RNA, allowing the precise and rapid identification of the boundaries of any matches to the non-coding RNA in a genome.

Value

An object of class NonCoding.

Author(s)

Erik Wright eswright@pitt.edu

See Also

FindNonCoding, NonCoding-class

Examples

# import a family of non-coding RNAs
fas_path <- system.file("extdata",
	"IhtA.fas",
	package="DECIPHER")
rna <- readRNAStringSet(fas_path)
rna

# align the sequences
RNA <- AlignSeqs(rna)
RNA

y <- LearnNonCoding(RNA)
y
y[["motifs"]]
y[["hairpins"]]
head(y[["kmers"]])

[Package DECIPHER version 2.22.0 Index]