1 Introduction

Post-transcriptional modifications can be found abundantly in rRNA and tRNA and can be detected classically via several strategies. However, difficulties arise if the identity and the position of the modified nucleotides is to be determined at the same time. Classically, a primer extension, a form of reverse transcription (RT), would allow certain modifications to be accessed by blocks during the RT or changes in the cDNA sequences. Other modification would need to be selectively treated by chemical reactions to influence the outcome of the reverse transcription.

With the increased availability of high throughput sequencing, these classical methods were adapted to high throughput methods allowing more RNA molecules to be accessed at the same time. With these advances post-transcriptional modifications were also detected on mRNA. Among these high throughput techniques are for example Pseudo-Seq (Carlile et al. 2014), RiboMethSeq (Birkedal et al. 2015) and AlkAnilineSeq (Marchand et al. 2018) each able to detect a specific type of modification from footprints in RNA-Seq data prepared with the selected methods.

Since similar pattern can be observed from some of these techniques, overlaps of the bioinformatical pipeline already are and will become more frequent with new emerging sequencing techniques.

RNAmodR implements classes and a workflow to detect post-transcriptional RNA modifications in high throughput sequencing data. It is easily adaptable to new methods and can help during the phase of initial method development as well as more complex screenings.

Briefly, from the SequenceData, specific subclasses are derived for accessing specific aspects of aligned reads, e.g. 5’-end positions or pileup data. With this a Modifier class can be used to detect specific patterns for individual types of modifications. The SequenceData classes can be shared by different Modifier classes allowing easy adaptation to new methods.

## snapshotDate(): 2020-10-02

library(rtracklayer)
library(Rsamtools)
library(GenomicFeatures)
library(RNAmodR.Data)
library(RNAmodR)

1.1 SequenceData

Each SequenceData object is created with a named character vector, which can be coerced to a BamFileList, or named BamFileList. The names must be either “treated” or “control” describing the condition the data file belongs to. Multiple files can be given per condition and are used as replicates.

annotation <- GFF3File(RNAmodR.Data.example.gff3())
sequences <- RNAmodR.Data.example.fasta()
files <- c(Treated = RNAmodR.Data.example.bam.1(),
           Treated = RNAmodR.Data.example.bam.2(),
           Treated = RNAmodR.Data.example.bam.3())

For annotation and sequences several input are accepted. annotation can be a GRangesList, a GFF3File or a TxDb object. Internally, a GFF3File is converted to a TxDb object and a GRangesList is retrieved using the exonsBy function.

seqdata <- End5SequenceData(files, annotation = annotation, 
                            sequences = sequences)

## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ... OK
## Loading 5'-end position data from BAM files ... OK

seqdata

## End5SequenceData with 60 elements containing 3 data columns and 3 metadata columns
## - Data columns:
##  end5.treated.1 end5.treated.2 end5.treated.3
##       <integer>      <integer>      <integer>
## -  Seqinfo object with 84 sequences from an unspecified genome; no seqlengths:

SequenceData extends from a CompressedSplitDataFrameList and contains the data per transcript alongside the annotation information and the sequence. The additional data stored within the SequenceData can be accessed by several functions.

names(seqdata) # matches the transcript names as returned by a TxDb object
colnames(seqdata) # returns a CharacterList of all column names
bamfiles(seqdata)
ranges(seqdata) # generate from a TxDb object
sequences(seqdata)
seqinfo(seqdata)

Currently the following SequenceData classes are implemented:

End5SequenceData
End3SequenceData
EndSequenceData
ProtectedEndSequenceData
CoverageSequenceData
PileupSequenceData
NormEnd5SequenceData
NormEnd3SequenceData

The data types and names of the columns are different for most of the SequenceData classes. As a naming convenction a descriptor is combined with the condition as defined in the files input and the replicate number. For more details please have a look at the man pages, e.g. ?End5SequenceData.

SequenceData objects can be subset like a CompressedSplitDataFrameList. Elements are returned as a SequenceDataFrame dependent of the type of SequenceData used. For each SequenceData class a matching SequenceDataFrame is implemented.

seqdata[1]

## End5SequenceData with 1 elements containing 3 data columns and 3 metadata columns
## - Data columns:
##  end5.treated.1 end5.treated.2 end5.treated.3
##       <integer>      <integer>      <integer>
## -  Seqinfo object with 84 sequences from an unspecified genome; no seqlengths:

sdf <- seqdata[[1]]
sdf

## End5SequenceDataFrame with 1649 rows and 3 columns
##      end5.treated.1 end5.treated.2 end5.treated.3
##           <integer>      <integer>      <integer>
## 1                 1              4              0
## 2                 0              2              0
## 3                 0              0              0
## 4                 0              0              0
## 5                 0              0              0
## ...             ...            ...            ...
## 1645              0              0              0
## 1646              0              0              0
## 1647              0              0              0
## 1648              0              0              0
## 1649              0              0              0
## 
## containing a GRanges object with 1 range and 3 metadata columns:
##             seqnames    ranges strand |   exon_id   exon_name exon_rank
##                <Rle> <IRanges>  <Rle> | <integer> <character> <integer>
##   [1] Q0020_15S_RRNA    1-1649      + |         1       Q0020         1
##   -------
##   seqinfo: 60 sequences from an unspecified genome; no seqlengths
## 
## and a 1649-letter RNAString object
## seq: GUAAAAAAUUUAUAAGAAUAUGAUGUUGGUUCAGAU...UGCGGUGGGCUUAUAAAUAUCUUAAAUAUUCUUACA

The SequenceDataFrame objects retains some accessor functions from the SequenceData class.

names(sdf) # this returns the columns names of the data
ranges(sdf)
sequences(sdf)

Subsetting of a SequenceDataFrame returns a SequenceDataFrame or DataFrame, depending on whether it is subset by a column or row, respectively. The drop argument is ignored for column subsetting.

sdf[,1:2]

## End5SequenceDataFrame with 1649 rows and 2 columns
##      end5.treated.1 end5.treated.2
##           <integer>      <integer>
## 1                 1              4
## 2                 0              2
## 3                 0              0
## 4                 0              0
## 5                 0              0
## ...             ...            ...
## 1645              0              0
## 1646              0              0
## 1647              0              0
## 1648              0              0
## 1649              0              0
## 
## containing a GRanges object with 1 range and 3 metadata columns:
##             seqnames    ranges strand |   exon_id   exon_name exon_rank
##                <Rle> <IRanges>  <Rle> | <integer> <character> <integer>
##   [1] Q0020_15S_RRNA    1-1649      + |         1       Q0020         1
##   -------
##   seqinfo: 60 sequences from an unspecified genome; no seqlengths
## 
## and a 1649-letter RNAString object
## seq: GUAAAAAAUUUAUAAGAAUAUGAUGUUGGUUCAGAU...UGCGGUGGGCUUAUAAAUAUCUUAAAUAUUCUUACA

sdf[1:3,]

## DataFrame with 3 rows and 3 columns
##   end5.treated.1 end5.treated.2 end5.treated.3
##        <integer>      <integer>      <integer>
## 1              1              4              0
## 2              0              2              0
## 3              0              0              0

1.2 Modifier

Whereas, the SequenceData classes are used to hold the data, Modifier classes are used to detect certain features within high throughput sequencing data to assign the presence of specific modifications for an established pattern. The Modifier class (and its nucleotide specific subclasses RNAModifier and DNAModifier) is virtual and can be addapted to individual methods. For example mapped reads can be analyzed using the ModInosine class to reveal the presence of I by detecting a A to G conversion in normal RNA-Seq data. Therefore, ModInosine inherits from RNAModifier.

To fix the data processing and detection strategy, for each type of sequencing method a Modifier class can be developed alongside to detect modifications. For more information on how to develop such a class and potentially a new corresponding SequenceData class, please have a look at the vignette for creating a new analysis.

For now three Modifier classes are available:

ModInosine
ModRiboMethSeq from the RNAmodR.RiboMethSeq package
ModAlkAnilineSeq from the RNAmodR.AlkAnilineSeq package

Modifier objects can use and wrap multiple SequenceData objects as elements of a SequenceDataSet class. The elements of this class are different types of SequenceData, which are required by the specific Modifier class. However, they are required to contain data for the same annotation and sequence data.

Modifier objects are created with the same arguments as SequenceData objects and will start loading the necessary SequenceData objects from these. In addition they will automatically start to calculate any additional scores (aggregation) and then start to search for modifications, if the optional argument find.mod is not set to FALSE.

mi <- ModInosine(files, annotation = annotation, sequences = sequences)

## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ... OK
## Loading Pileup data from BAM files ... OK
## Aggregating data and calculating scores ... Starting to search for 'Inosine' ... done.

(Hint: If you use an artificial genome, name the chromosomes chr1-chrN. It will make some things easier for subsequent visualization, which relies on the Gviz package)

Since the Modifier class wraps a SequenceData object the accessors to data contained within work similarly to the SequenceData accessors described above. What type of conditions the Modifier class expects/supports is usually described in the man pages of the Modifier class.

names(mi) # matches the transcript names as returned by a TxDb object
bamfiles(mi)
ranges(mi) # generated from a TxDb object
sequences(mi)
seqinfo(mi)
sequenceData(mi) # returns the SequenceData

1.2.1 Settings

The behavior of a Modifier class can be fine tuned using settings. The settings() function is a getter/setter for arguments used in the analysis and my differ between different Modifier classes depending on the particular strategy and whether they are implemented as flexible settings.

settings(mi)

## $minCoverage
## [1] 10
## 
## $minReplicate
## [1] 1
## 
## $find.mod
## [1] TRUE
## 
## $minScore
## [1] 0.4

settings(mi,"minScore")

## [1] 0.4

settings(mi) <- list(minScore = 0.5)
settings(mi,"minScore")

## [1] 0.5

1.3 ModifierSet

Each Modifier object is able to represent one sample set with multiple replicates of data. To easily compare multiple sample sets the ModifierSet class is implemented.

The ModifierSet object is created from a named list of named character vectors or BamFileList objects. Each element in the list is a sample type with a corresponding name. Each entry in the character vector/BamFileList is a replicate (Alternatively a ModifierSet can also be created from a list of Modifier objects, if they are of the same type).

sequences <- RNAmodR.Data.example.AAS.fasta()
annotation <- GFF3File(RNAmodR.Data.example.AAS.gff3())
files <- list("SampleSet1" = c(treated = RNAmodR.Data.example.wt.1(),
                               treated = RNAmodR.Data.example.wt.2(),
                               treated = RNAmodR.Data.example.wt.3()),
              "SampleSet2" = c(treated = RNAmodR.Data.example.bud23.1(),
                               treated = RNAmodR.Data.example.bud23.2()),
              "SampleSet3" = c(treated = RNAmodR.Data.example.trm8.1(),
                               treated = RNAmodR.Data.example.trm8.2()))

msi <- ModSetInosine(files, annotation = annotation, sequences = sequences)

## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ... OK

The creation of the ModifierSet will itself trigger the creation of a Modifier object each containing data from one sample set. This step is parallelized using the BiocParallel package. If a Modifier class itself uses parallel computing for its analysis, it is switched off unless internalBP = TRUE is set. In this case each Modifier object is created in sequence, allowing parallel computing during the creation of each object.

names(msi)

## [1] "SampleSet1" "SampleSet2" "SampleSet3"

msi[[1]]

## A ModInosine object containing PileupSequenceData with 11 elements.
## | Input files:
##    - treated: /home/biocbuild/.cache/ExperimentHub/1151668c6595_2544
##    - treated: /home/biocbuild/.cache/ExperimentHub/2a2730c64fd6_2546
##    - treated: /home/biocbuild/.cache/ExperimentHub/21fb6d60bb23_2548
## | Nucleotide - Modification type(s):  RNA  -  I 
## | Modifications found: yes (6) 
## | Settings:
##   minCoverage minReplicate  find.mod  minScore
##     <integer>    <integer> <logical> <numeric>
##            10            1      TRUE       0.4

Again accessors remain mostly the same as described above for the Modifier class returning a list of results, one element for each Modifier object.

bamfiles(msi)
ranges(msi) # generate from a TxDb object
sequences(msi)
seqinfo(msi)

2 Analysis of detected modifications

Found modifications can be retrieved from a Modifier or ModifierSet object via the modifications() function. The function returns a GRanges or GRangesList object, respectively, which contains the coordinates of the modifications with respect to the genome used. For example if a transcript starts at position 100 and contains a modified nucleotide at position 50 of the transcript, the returned coordinate will 150.

mod <- modifications(msi)
mod[[1]]

## GRanges object with 6 ranges and 5 metadata columns:
##       seqnames    ranges strand |         mod      source        type     score
##          <Rle> <IRanges>  <Rle> | <character> <character> <character> <numeric>
##   [1]     chr2        34      + |           I     RNAmodR      RNAMOD  0.900932
##   [2]     chr4        35      + |           I     RNAmodR      RNAMOD  0.899622
##   [3]     chr6        34      + |           I     RNAmodR      RNAMOD  0.984035
##   [4]     chr7        67      + |           I     RNAmodR      RNAMOD  0.934553
##   [5]     chr9         7      + |           I     RNAmodR      RNAMOD  0.709758
##   [6]    chr11        35      + |           I     RNAmodR      RNAMOD  0.874027
##            Parent
##       <character>
##   [1]           2
##   [2]           4
##   [3]           6
##   [4]           7
##   [5]           9
##   [6]          11
##   -------
##   seqinfo: 11 sequences from an unspecified genome; no seqlengths

To retrieve the coordinates with respect to the transcript boundaries, use the optional argument perTranscript = TRUE. In the example provided here, this will yield the same coordinates, since a custom genome was used for mapping of the example, which does not contain transcripts on the negative strand and per transcript chromosomes.

mod <- modifications(msi, perTranscript = TRUE)
mod[[1]]

## GRanges object with 6 ranges and 5 metadata columns:
##       seqnames    ranges strand |         mod      source        type     score
##          <Rle> <IRanges>  <Rle> | <character> <character> <character> <numeric>
##   [1]     chr2        34      * |           I     RNAmodR      RNAMOD  0.900932
##   [2]     chr4        35      * |           I     RNAmodR      RNAMOD  0.899622
##   [3]     chr6        34      * |           I     RNAmodR      RNAMOD  0.984035
##   [4]     chr7        67      * |           I     RNAmodR      RNAMOD  0.934553
##   [5]     chr9         7      * |           I     RNAmodR      RNAMOD  0.709758
##   [6]    chr11        35      * |           I     RNAmodR      RNAMOD  0.874027
##            Parent
##       <character>
##   [1]           2
##   [2]           4
##   [3]           6
##   [4]           7
##   [5]           9
##   [6]          11
##   -------
##   seqinfo: 11 sequences from an unspecified genome; no seqlengths

2.1 Compairing results

To compare results between samples, a ModifierSet as well as a definition of positions to compare are required. To construct a set of positions, we will use the intersection of all modifications found as an example.

mod <- modifications(msi)
coord <- unique(unlist(mod))
coord$score <- NULL
coord$sd <- NULL
compareByCoord(msi,coord)

## DataFrame with 6 rows and 6 columns
##   SampleSet1 SampleSet2 SampleSet3    names positions         mod
##    <numeric>  <numeric>  <numeric> <factor>  <factor> <character>
## 1   0.900932   0.998134   0.953651       2         34           I
## 2   0.899622   0.856241   0.976928       4         35           I
## 3   0.984035   0.992012   0.993128       6         34           I
## 4   0.934553   0.942905   0.943773       7         67           I
## 5   0.709758   0.766484   0.681451       9         7            I
## 6   0.874027   0.971474   0.954782       11        35           I

The result can also be plotted using plotCompareByCoord, which accepts an optional argument alias to allow transcript ids to be converted to other identifiers. For this step it is probably helpful to construct a TxDb object right at the beginning and use it for constructing the Modifier/ModifierSet object as the annotation argument.

txdb <- makeTxDbFromGFF(annotation)

## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ... OK

alias <- data.frame(tx_id = names(id2name(txdb)),
                    name = id2name(txdb))

plotCompareByCoord(msi, coord, alias = alias)

Figure 1: Heatmap for identified Inosine positions

Additionally, the order of sample sets can be adjusted, normalized to any of the sample sets and the numbering of positions shown per transcript.

plotCompareByCoord(msi[c(3,1,2)], coord, alias = alias, normalize = "SampleSet3",
                   perTranscript = TRUE)

Figure 2: Heatmap for identified Inosine positions with normalized scores

The calculated scores and data can be visualized along the transcripts or chunks of the transcript. With the optional argument showSequenceData the plotting of the sequence data in addition to the score data can be triggered by setting it to TRUE.

plotData(msi, "2", from = 10L, to = 45L, alias = alias) # showSequenceData = FALSE

Figure 3: Scores along a transcript containing a A to G conversion indicating the presence of Inosine

plotData(msi[1:2], "2", from = 10L, to = 45L, showSequenceData = TRUE, alias = alias)

Scores along a transcript containing a A to G conversion indicating the presence of Inosine. This figure includes the detailed pileup sequence data.

Figure 4: Scores along a transcript containing a A to G conversion indicating the presence of Inosine
This figure includes the detailed pileup sequence data.

3 Performance measurements

Since the detection of modifications from high throughput sequencing data relies usually on thresholds for calling modifications, there is considerable interest in analyzing the performance of the method based on scores chosen and available samples. To analyse the performance, the function plotROC() is implemented, which is a wrapper around the functionality of the ROCR package (Sing et al. 2005)(#References).

For the example data used in this vignette, the information gained is rather limited and the following figure should be regarded just as a proof of concept. In addition, the use of found modifications sites as an input for plotROC is strongly discouraged, since defeats the purpose of the test. Therefore, please regard this aspect of the next chunk as proof of concept as well.

plotROC(msi, coord)

Figure 5: TPR vs. FPR plot

Please have a look at ?plotROC for additional details. Most of the functionality from the ROCR package is available via additional arguments, thus the output of plotROC can be heavily customized.

4 Additional informations

To have a look at metadata of reads for an analysis with RNAmodR the function stats() can be used. It can be used with a bunch of object types: SequenceData, SequenceDataList, SequenceDataSet, Modifier or ModifierSet. For SequenceData* objects, the BamFile to be analyzed must be provided as well, which automatically done for Modifier* objects. For more details have a look at ?stats.

stats <- stats(msi)
stats

## List of length 3
## names(3): SampleSet1 SampleSet2 SampleSet3

stats[["SampleSet1"]]

## DataFrameList of length 3
## names(3): treated treated treated

stats[["SampleSet1"]][["treated"]]

## DataFrame with 11 rows and 6 columns
##    seqnames seqlength    mapped  unmapped          used     used_distro
##    <factor> <integer> <numeric> <numeric> <IntegerList>          <List>
## 1     chr1       1800    197050         0        159782 83,1252,860,...
## 2     chr2         85      5863         0          2459     2,16,16,...
## 3     chr3         76     76905         0         63497 35,478,4106,...
## 4     chr4         77      8299         0          6554     6,27,36,...
## 5     chr5         74     11758         0          8818  520,105,93,...
## 6     chr6         76      3504         0          2427     1,38,45,...
## 7     chr7         75     14556         0         12243  1532,46,12,...
## 8     chr8         75    144293         0        143068    14,44,48,...
## 9     chr9         75     13790         0          9753     1,49,43,...
## 10    chr10        85     19861         0         17729    35,21,10,...
## 11    chr11        77     10532         0          9086   53,92,185,...

The data returned by stats() is a DataFrame, which can be wrapped as a DataFrameList or a SimpleList depending on the input type. Analysis of the data must be manually done and can be used to produced output like the following plot for distribution of lengths for reads analyzed.

Figure 6: Distribution of lengths for reads used in the analysis

5 Further development

The development of RNAmodR will continue. General ascpects of the analysis workflow will be addressed in the RNAmodR package, whereas additional classes for new sequencing techniques targeted at detecting post-transcriptional will be wrapped in individual packages. This will allow general improvements to propagate upstream, but not hinder individual requirements of each detection strategy.

For an example have a look at the RNAmodR.RiboMethSeq and RNAmodR.AlkAnilineSeq packages.

Features, which might be added in the future:

interaction with our packages for data aggregation (for example meta gene aggregation)
interaction with our packages for downstream analysis for visualization

We welcome contributions of any sort.

6 Sessioninfo

sessionInfo()

## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.12-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.12-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] RNAmodR_1.4.2            Modstrings_1.6.0         RNAmodR.Data_1.4.0      
##  [4] ExperimentHubData_1.16.0 AnnotationHubData_1.20.0 futile.logger_1.4.3     
##  [7] ExperimentHub_1.16.0     AnnotationHub_2.22.0     BiocFileCache_1.14.0    
## [10] dbplyr_2.0.0             GenomicFeatures_1.42.1   AnnotationDbi_1.52.0    
## [13] Biobase_2.50.0           Rsamtools_2.6.0          Biostrings_2.58.0       
## [16] XVector_0.30.0           rtracklayer_1.50.0       GenomicRanges_1.42.0    
## [19] GenomeInfoDb_1.26.2      IRanges_2.24.1           S4Vectors_0.28.1        
## [22] BiocGenerics_0.36.0      BiocStyle_2.18.1        
## 
## loaded via a namespace (and not attached):
##   [1] backports_1.2.1               Hmisc_4.4-2                  
##   [3] plyr_1.8.6                    lazyeval_0.2.2               
##   [5] splines_4.0.3                 BiocParallel_1.24.1          
##   [7] ggplot2_3.3.2                 digest_0.6.27                
##   [9] ensembldb_2.14.0              htmltools_0.5.0              
##  [11] magick_2.5.2                  magrittr_2.0.1               
##  [13] checkmate_2.0.0               memoise_1.1.0                
##  [15] BSgenome_1.58.0               cluster_2.1.0                
##  [17] ROCR_1.0-11                   matrixStats_0.57.0           
##  [19] askpass_1.1                   prettyunits_1.1.1            
##  [21] jpeg_0.1-8.1                  colorspace_2.0-0             
##  [23] blob_1.2.1                    rappdirs_0.3.1               
##  [25] xfun_0.19                     dplyr_1.0.2                  
##  [27] crayon_1.3.4                  RCurl_1.98-1.2               
##  [29] jsonlite_1.7.2                graph_1.68.0                 
##  [31] VariantAnnotation_1.36.0      survival_3.2-7               
##  [33] glue_1.4.2                    gtable_0.3.0                 
##  [35] zlibbioc_1.36.0               DelayedArray_0.16.0          
##  [37] scales_1.1.1                  futile.options_1.0.1         
##  [39] DBI_1.1.0                     Rcpp_1.0.5                   
##  [41] xtable_1.8-4                  progress_1.2.2               
##  [43] htmlTable_2.1.0               foreign_0.8-80               
##  [45] bit_4.0.4                     OrganismDbi_1.32.0           
##  [47] Formula_1.2-4                 AnnotationForge_1.32.0       
##  [49] htmlwidgets_1.5.3             httr_1.4.2                   
##  [51] getopt_1.20.3                 RColorBrewer_1.1-2           
##  [53] ellipsis_0.3.1                farver_2.0.3                 
##  [55] pkgconfig_2.0.3               XML_3.99-0.5                 
##  [57] Gviz_1.34.0                   nnet_7.3-14                  
##  [59] labeling_0.4.2                reshape2_1.4.4               
##  [61] tidyselect_1.1.0              rlang_0.4.9                  
##  [63] later_1.1.0.1                 munsell_0.5.0                
##  [65] biocViews_1.58.1              BiocVersion_3.12.0           
##  [67] tools_4.0.3                   generics_0.1.0               
##  [69] RSQLite_2.2.1                 evaluate_0.14                
##  [71] stringr_1.4.0                 fastmap_1.0.1                
##  [73] yaml_2.2.1                    knitr_1.30                   
##  [75] bit64_4.0.5                   purrr_0.3.4                  
##  [77] AnnotationFilter_1.14.0       RBGL_1.66.0                  
##  [79] mime_0.9                      formatR_1.7                  
##  [81] xml2_1.3.2                    biomaRt_2.46.0               
##  [83] compiler_4.0.3                rstudioapi_0.13              
##  [85] curl_4.3                      png_0.1-7                    
##  [87] interactiveDisplayBase_1.28.0 tibble_3.0.4                 
##  [89] stringi_1.5.3                 highr_0.8                    
##  [91] lattice_0.20-41               ProtGenerics_1.22.0          
##  [93] Matrix_1.2-18                 vctrs_0.3.5                  
##  [95] stringdist_0.9.6.3            pillar_1.4.7                 
##  [97] BiocCheck_1.26.0              lifecycle_0.2.0              
##  [99] RUnit_0.4.32                  optparse_1.6.6               
## [101] BiocManager_1.30.10           data.table_1.13.4            
## [103] bitops_1.0-6                  colorRamps_2.3               
## [105] httpuv_1.5.4                  R6_2.5.0                     
## [107] latticeExtra_0.6-29           bookdown_0.21                
## [109] promises_1.1.1                gridExtra_2.3                
## [111] codetools_0.2-18              dichromat_2.0-0              
## [113] lambda.r_1.2.4                assertthat_0.2.1             
## [115] SummarizedExperiment_1.20.0   openssl_1.4.3                
## [117] withr_2.3.0                   rBiopaxParser_2.30.0         
## [119] GenomicAlignments_1.26.0      GenomeInfoDbData_1.2.4       
## [121] hms_0.5.3                     grid_4.0.3                   
## [123] rpart_4.1-15                  rmarkdown_2.5                
## [125] MatrixGenerics_1.2.0          biovizBase_1.38.0            
## [127] shiny_1.5.0                   base64enc_0.1-3

References

Birkedal, Ulf, Mikkel Christensen-Dalsgaard, Nicolai Krogh, Radhakrishnan Sabarinathan, Jan Gorodkin, and Henrik Nielsen. 2015. “Profiling of Ribose Methylations in Rna by High-Throughput Sequencing.” Angewandte Chemie (International Ed. In English) 54 (2): 451–55. https://doi.org/10.1002/anie.201408362.

Carlile, Thomas M., Maria F. Rojas-Duran, Boris Zinshteyn, Hakyung Shin, Kristen M. Bartoli, and Wendy V. Gilbert. 2014. “Pseudouridine Profiling Reveals Regulated mRNA Pseudouridylation in Yeast and Human Cells.” Nature 515 (7525): 143–46.

Marchand, Virginie, Lilia Ayadi, Felix G. M. Ernst, Jasmin Hertler, Valérie Bourguignon-Igel, Adeline Galvanin, Annika Kotter, Mark Helm, Denis L. J. Lafontaine, and Yuri Motorin. 2018. “AlkAniline-Seq: Profiling of m7G and m3C Rna Modifications at Single Nucleotide Resolution.” Angewandte Chemie International Edition 57 (51): 16785–90. https://doi.org/10.1002/anie.201810946.

Sing, Tobias, Oliver Sander, Niko Beerenwinkel, and Thomas Lengauer. 2005. “ROCR: Visualizing Classifier Performance in R.” Bioinformatics (Oxford, England) 21 (20): 3940–1. https://doi.org/10.1093/bioinformatics/bti623.

RNAmodR: analyzing high throughput sequencing data for post-transcriptional RNA modification footprints

2020-12-13

Package