profileAccuracyEstimate {ChIPanalyser} | R Documentation |
profileAccuracyEstimate compares the predicted ChIP-seq-like profile to
real ChIP-seq data and returns a set of metrics describing how accurately
the predicted profile matches the real data.
profileAccuracyEstimate(LocusProfile, predictedProfile, occupancyProfileParameters = NULL)
LocusProfile |
|
predictedProfile |
|
occupancyProfileParameters |
|
The accuracy of the predicted profile is estimated by measuring the
correlation, the Mean Squared Error (MSE) and theta (an in-house metric
based on a modified ratio of correlation over MSE) between the predicted
profiles and real ChIP-seq data. Real ChIP-seq profiles should be
normalised to base pair level: the enrichment score of each range is
divided by the width of that range, so that the end result is a numeric
vector whose length equals the length of the locus in base pairs. If an
occupancyProfileParameters object is not supplied, one will be created
internally. However, we strongly advise supplying the same
occupancyProfileParameters object that was used in the previous steps.
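The normalisation and two of the metrics described above can be illustrated with a minimal base R sketch. This is not the package's internal code: the toy ranges, scores and predicted values are hypothetical, the use of Pearson correlation is an assumption, and theta is left out because its exact formula is not given here.

```r
# Hypothetical ChIP-seq ranges covering a 10 bp locus (toy values only)
starts <- c(1, 4, 8)
ends   <- c(3, 7, 10)
scores <- c(6, 8, 3)

# Normalise to base pair level: score divided by range width,
# repeated once for every base pair in that range
widths  <- ends - starts + 1
perBase <- rep(scores / widths, times = widths)

# Hypothetical predicted profile of the same length as the locus
predicted <- c(1.8, 2.1, 2.0, 2.2, 1.9, 2.0, 2.1, 1.2, 0.9, 1.0)

# Correlation (Pearson, assumed) and Mean Squared Error
corr <- cor(predicted, perBase)
mse  <- mean((predicted - perBase)^2)
```

Both vectors must have the same length, which is why the real profile has to be expanded to one value per base pair before any comparison.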
Returns a list of lists. Each element of the outer list represents a
combination of lambda (see ScalingFactorPWM) and bound molecules
(see boundMolecules), and the list within each element is
the list of loci of interest. At the core of these lists is a
named vector containing the correlation and MSE for the given locus, as
well as meanCorr, meanMSE and meanTheta over all loci for that combination
of lambda and bound molecules.
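As a rough illustration of how such a nested result might be navigated, the sketch below builds a mock object by hand and picks the parameter combination with the highest mean correlation. The outer element names, the locus name and all numeric values are made-up placeholders, and the name "correlation" for the per-locus entry is an assumption; only the nesting and the meanCorr, meanMSE and meanTheta entries follow the description above.

```r
# Mock result: outer list = parameter combinations, inner list = loci,
# core = named numeric vector of metrics (all names and values hypothetical)
mockEstimate <- list(
  "lambda1_bm1000" = list(
    locusA = c(correlation = 0.62, MSE = 0.011,
               meanCorr = 0.62, meanMSE = 0.011, meanTheta = 1.9)
  ),
  "lambda2_bm1000" = list(
    locusA = c(correlation = 0.71, MSE = 0.009,
               meanCorr = 0.71, meanMSE = 0.009, meanTheta = 2.4)
  )
)

# Extract meanCorr from the first locus of every parameter combination
meanCorr <- sapply(mockEstimate, function(combo) combo[[1]][["meanCorr"]])

# Name of the combination with the highest mean correlation
best <- names(which.max(meanCorr))
```

On a real result, inspecting the object with str() first is a safer way to confirm the actual element names before indexing into it.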
Patrick C. N. Martin <pm16057@essex.ac.uk>
Zabet NR, Adryan B (2015) Estimating binding properties of transcription factors from genome-wide binding profiles. Nucleic Acids Res., 43, 84–94.
# Load the package and the example data
# (provides eveLocus, Access and eveLocusChip)
library(ChIPanalyser)
data(ChIPanalyserData)

# Path to the Position Frequency Matrix
PFM <- file.path(system.file("extdata", package = "ChIPanalyser"), "BCDSlx.pfm")

# As an example of a genome, this example will run on the Drosophila genome
if(!require("BSgenome.Dmelanogaster.UCSC.dm3", character.only = TRUE)){
    source("https://bioconductor.org/biocLite.R")
    biocLite("BSgenome.Dmelanogaster.UCSC.dm3")
}
library(BSgenome.Dmelanogaster.UCSC.dm3)
DNASequenceSet <- getSeq(BSgenome.Dmelanogaster.UCSC.dm3)

# Building data objects
GPP <- genomicProfileParameters(PFM = PFM, BPFrequency = DNASequenceSet)
OPP <- occupancyProfileParameters()

# Computing genome-wide PWM scores
GenomeWide <- computeGenomeWidePWMScore(DNASequenceSet = DNASequenceSet,
    genomicProfileParameters = GPP)

# Computing PWM scores at the locus of interest
PWMScores <- computePWMScore(DNASequenceSet = DNASequenceSet,
    genomicProfileParameters = GenomeWide,
    setSequence = eveLocus, DNAAccessibility = Access)

# Computing occupancy
Occupancy <- computeOccupancy(AllSitesPWMScore = PWMScores,
    occupancyProfileParameters = OPP)

# Computing ChIP profiles
chipProfile <- computeChipProfile(setSequence = eveLocus,
    occupancy = Occupancy, occupancyProfileParameters = OPP)

# Estimating the accuracy of the predicted profile
AccuracyEstimate <- profileAccuracyEstimate(LocusProfile = eveLocusChip,
    predictedProfile = chipProfile, occupancyProfileParameters = OPP)