if (!require("BiocManager")) install.packages("BiocManager") BiocManager::install("BindingSiteFinder")
Most cellular processes are regulated by RNA-binding proteins (RBPs). Knowledge on their exact positioning can be obtained from individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) experiments. In a recent publication we described a complete analysis workflow to detect RBP binding sites from iCLIP data. The workflow covers all essential steps, from quality control of sequencing reads, different peak calling options, to the downstream analysis and definition of binding sites. Pre-processing and peak calling steps relies on publicly available software, whereas the definition of the final binding sites follows a custom procedure implemented in R. This vignette explains how equally sized binding sites can be defined from a genome-wide iCLIP coverage.
The workflow described herein is based on our recently published complete iCLIP analysis pipeline (Busch et al. 2020). Thus, we expect the user to have preprocessed their iCLIP sequencing reads up to the point of the peak calling step. In brief, this includes basic processing of the sequencing reads, such as quality filtering, barcode handling, mapping and the generation of a single nucleotide crosslink file for all replicates under consideration. As we describe in our manuscript replicate .bam files may or may not be merged prior to peak calling, for which we suggest PureCLIP (Krakau, Richard, and Marsico 2017). For simplicity we address only the case where peak calling was based on the merge of all replicates.