findPeakHeight {htSeqTools} | R Documentation |
Uses False Discovery Rate to estimate the optimal value of the minHeight parameter in the call to enrichedPeaks. False Discovery Rate is computed by swapping IP and Input samples and calculating the ratio of 'false' peaks identified for a given set of minHeight values.
findPeakHeight(regions, sample1, sample2, hmin=5, hmax=200, myfdr=0.01, gridSize=25, space, mc.cores=1) plotminHeight(x,...)
regions |
|
sample1 |
|
sample2 |
Same for sample 2 (Control sample). |
hmin |
Minimum minHeight value to be considered for FDR estimation. Defaults to 5. |
hmax |
Maximum minHeight value to be considered for FDR estimation. Max coverage difference between sample 1 and sample 2 will be also calculated, and the minimum of the two will be used as hmax. This is done to avoid a skewed distribution of minHeight values to test for FDR estimation. |
myfdr |
Desired FDR cut-off. |
gridSize |
Number of intermediate steps of minHeight threshold to consider between hmin and hmax. Since FDR and optimal minHeight estimation is done by actually performing peak calls, selecting a high value for gridSize can come at a big computational cost. Default value of 25 or close is recommended. |
space |
Character text giving the name of the space for the
|
mc.cores |
If |
x |
An object of class |
... |
Other graphical arguments passed to function |
Object of class list
with slots fdr, npeaks
indicating peak calling FDR and number of peaks identified in the IP
sample for each considered minHeight value and cut, opt
with
the desired FDR cut-off and the correponding minHeight value to be
used in the call to enrichedPeaks
.
signature(regions = "RangedData", sample1 = "RangedData",
sample2 = "RangedData")
sample1
indicates the start/end of
reads in sample 1 (IP Sample), and similarly for sample2
(Control sample). Only the subset of
regions
indicated by the argument space
will be used.
signature(regions = "RangedData", sample1 = "IRangesList",
sample2 = "IRangesList")
regions
contains the regions of
interest, sample1
and sample2
the reads in sample 1
and sample 2, respectively. names(sample1)
and
names(sample2)
must correspond to the space names used in regions
.
signature(regions = "RangedData", sample1 = "Rle",
sample2 = "Rle")
regions
contains the regions of
interest, sample1
and sample2
the coverage for the reads in sample 1
and sample 2, respectively. names(sample1)
and
names(sample2)
must correspond to the space names used in regions
.
enrichedPeaks
set.seed(1) st <- round(rnorm(1000,500,100)) strand <- rep(c('+','-'),each=500) space <- rep('chr1',length(st)) sample1 <- RangedData(IRanges(st,st+38),strand=strand,space=space) st <- round(runif(1000,1,1000)) sample2 <- RangedData(IRanges(st,st+38),strand=strand,space=space) #Find enriched regions and call peaks mappedreads <- c(sample1=nrow(sample1),sample2=nrow(sample2)) regions <- enrichedRegions(sample1,sample2,mappedreads=mappedreads,minReads=50) minHeight <- findPeakHeight(regions,sample1=sample1,sample2=sample2) plotminHeight(minHeight) peaks <- enrichedPeaks(regions,sample1=sample1,sample2=sample2,minHeight=minHeight$opt) peaks <- peaks[width(peaks)>10,] peaks #Compute coverage in peaks cover <- coverage(as(sample1,'GRanges')) coverinpeaks <- regionsCoverage(chr=seqnames(peaks),start=start(peaks),end=end(peaks),cover=cover) #Evaluate coverage in regular grid and plot #Can be helpful fo clustering of peak profiles coveringrid <- gridCoverage(coverinpeaks) coveringrid plot(coveringrid) #Standardize peak profiles dividing by max coverage stdcoveringrid <- stdGrid(coveringrid, colname='maxCov') stdcoveringrid plot(stdcoveringrid)