The epimutacionsData package is a repository of datasets for the epimutacions package. It includes 2 datasets to use as an example.
The following code shows how to access the data:
library(ExperimentHub)
eh <- ExperimentHub()
query(eh, c("epimutacionsData"))
## ExperimentHub with 3 records
## # snapshotDate(): 2024-10-24
## # $dataprovider: GEO, Illumina 450k array
## # $species: Homo sapiens
## # $rdataclass: RGChannelSet, GenomicRatioSet, GRanges
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH6690"]]'
##
## title
## EH6690 | Control and case samples
## EH6691 | Reference panel
## EH6692 | Candidate epimutations
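The query result can also be inspected as a table, which helps when choosing a record to download. The following is a minimal sketch, assuming an internet connection and a working ExperimentHub cache; the object names `records` and `meta` are only illustrative.

```r
library(ExperimentHub)

eh <- ExperimentHub()
records <- query(eh, "epimutacionsData")

# One row per record; the available columns include those listed
# under "additional mcols()" in the printed summary.
meta <- mcols(records)
meta[, c("title", "rdataclass", "description")]
```

The row names of `meta` are the record identifiers (for example `EH6690`) that are passed to `eh[[...]]` to download a dataset.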
In the Illumina 450K array (Reproducibility 2012), probes are unequally distributed along the genome, which limits the number of regions that can fulfil the requirements to be considered an epimutation. We have therefore computed a dataset containing the regions that are candidates to become an epimutation.
To define the candidate epimutations, we relied on the clustering from bumphunter (Jaffe et al. 2012).
We defined a primary dataset with all the CpGs from the Illumina 450K array.
Then, we ran bumphunter and selected those regions with at least 3 CpGs.
As a result, we found 40408 candidate epimutations, which are available in the candRegsGR dataset.
candRegsGR <- eh[["EH6692"]]
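To illustrate the idea behind the candidate regions, the sketch below applies only the clustering step to a handful of made-up CpG positions and keeps the clusters with at least 3 CpGs. It is not the procedure used to build `candRegsGR`: the coordinates are invented, and `maxGap = 1000` is an assumed value chosen for this toy example.

```r
library(bumphunter)

# Toy CpG coordinates (made up for illustration, not real 450K probes).
chr <- c("chr1", "chr1", "chr1", "chr1", "chr1", "chr2", "chr2")
pos <- c(100, 250, 400, 5000, 5100, 100, 200)

# Group CpGs into clusters: consecutive positions on the same chromosome
# share a cluster when they are at most maxGap bases apart.
# maxGap = 1000 is an assumed value for this example.
cluster <- clusterMaker(chr, pos, maxGap = 1000)

# Keep only the clusters with at least 3 CpGs.
sizes <- table(cluster)
kept <- as.integer(names(sizes)[sizes >= 3])
in_kept <- cluster %in% kept

# Summarise each retained cluster as a region.
regions <- data.frame(
  chr   = as.vector(tapply(chr[in_kept], cluster[in_kept], unique)),
  start = as.vector(tapply(pos[in_kept], cluster[in_kept], min)),
  end   = as.vector(tapply(pos[in_kept], cluster[in_kept], max)),
  n_cpg = as.integer(sizes[as.character(kept)])
)
regions
```

Clusters with fewer than 3 CpGs are dropped, so sparsely covered parts of the genome contribute no candidate regions.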
The package includes an RGChannelSet class reference panel (reference_panel), which contains 22 whole cord blood samples from healthy children born via caesarean section from the GSE127824 cohort (Gervin et al. 2019).
The reference panel can be found in the EH6691 record of the eh object:
reference_panel <- eh[["EH6691"]]
The methy dataset includes DNA methylation profiles of 51 whole blood samples: 48 controls from the GSE104812 cohort (Shi et al. 2018) and 3 cases from the GSE97362 cohort (Butcher et al. 2017).
It is a GenomicRatioSet class object.
methy <- eh[["EH6690"]]
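Once downloaded, the object can be explored with the usual minfi accessors. The sketch below is one possible way to look at it, assuming an internet connection for the first download; the name `betas` is illustrative.

```r
library(ExperimentHub)
library(minfi)

eh <- ExperimentHub()
methy <- eh[["EH6690"]]

# Number of CpGs (rows) and samples (columns).
dim(methy)

# Matrix of methylation beta values, one column per sample.
betas <- getBeta(methy)
betas[1:5, 1:3]

# Names of the sample-level annotation columns.
colnames(colData(methy))
```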
The IDAT files contain raw microarray intensities of 4 case samples from the GSE131350 cohort.
The files are located in the external data of the epimutacionsData package:
library(minfi)
baseDir <- system.file("extdata", package = "epimutacionsData")
targets <- read.metharray.sheet(baseDir)
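The sample sheet only lists the files. As a possible next step, the intensities themselves can be read with `read.metharray.exp()` from minfi. This is a sketch assuming the epimutacionsData package is installed locally; the name `rgset` is illustrative.

```r
library(minfi)

# Locate the IDAT directory shipped with the package and read its sample sheet.
baseDir <- system.file("extdata", package = "epimutacionsData")
targets <- read.metharray.sheet(baseDir)

# Read the raw red and green intensities for every sample in the sheet.
rgset <- read.metharray.exp(targets = targets)
rgset
```

The result is an RGChannelSet, the same class as the reference panel.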
sessionInfo()
## R Under development (unstable) (2024-10-21 r87258)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] minfi_1.53.0 bumphunter_1.49.0
## [3] locfit_1.5-9.10 iterators_1.0.14
## [5] foreach_1.5.2 Biostrings_2.75.0
## [7] XVector_0.47.0 SummarizedExperiment_1.37.0
## [9] Biobase_2.67.0 MatrixGenerics_1.19.0
## [11] matrixStats_1.4.1 GenomicRanges_1.59.0
## [13] GenomeInfoDb_1.43.0 IRanges_2.41.0
## [15] S4Vectors_0.45.0 epimutacionsData_1.11.0
## [17] ExperimentHub_2.15.0 AnnotationHub_3.15.0
## [19] BiocFileCache_2.15.0 dbplyr_2.5.0
## [21] BiocGenerics_0.53.1 generics_0.1.3
## [23] BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 jsonlite_1.8.9
## [3] magrittr_2.0.3 GenomicFeatures_1.59.0
## [5] rmarkdown_2.29 BiocIO_1.17.0
## [7] zlibbioc_1.53.0 vctrs_0.6.5
## [9] multtest_2.63.0 memoise_2.0.1
## [11] Rsamtools_2.23.0 DelayedMatrixStats_1.29.0
## [13] RCurl_1.98-1.16 askpass_1.2.1
## [15] htmltools_0.5.8.1 S4Arrays_1.7.1
## [17] curl_5.2.3 Rhdf5lib_1.29.0
## [19] SparseArray_1.7.0 rhdf5_2.51.0
## [21] sass_0.4.9 nor1mix_1.3-3
## [23] bslib_0.8.0 plyr_1.8.9
## [25] cachem_1.1.0 GenomicAlignments_1.43.0
## [27] mime_0.12 lifecycle_1.0.4
## [29] pkgconfig_2.0.3 Matrix_1.7-1
## [31] R6_2.5.1 fastmap_1.2.0
## [33] GenomeInfoDbData_1.2.13 digest_0.6.37
## [35] siggenes_1.81.0 reshape_0.8.9
## [37] AnnotationDbi_1.69.0 RSQLite_2.3.7
## [39] base64_2.0.2 filelock_1.0.3
## [41] fansi_1.0.6 httr_1.4.7
## [43] abind_1.4-8 compiler_4.5.0
## [45] beanplot_1.3.1 rngtools_1.5.2
## [47] bit64_4.5.2 withr_3.0.2
## [49] BiocParallel_1.41.0 DBI_1.2.3
## [51] HDF5Array_1.35.1 MASS_7.3-61
## [53] openssl_2.2.2 rappdirs_0.3.3
## [55] DelayedArray_0.33.1 rjson_0.2.23
## [57] tools_4.5.0 rentrez_1.2.3
## [59] quadprog_1.5-8 glue_1.8.0
## [61] restfulr_0.0.15 nlme_3.1-166
## [63] rhdf5filters_1.19.0 grid_4.5.0
## [65] tzdb_0.4.0 preprocessCore_1.69.0
## [67] tidyr_1.3.1 hms_1.1.3
## [69] data.table_1.16.2 xml2_1.3.6
## [71] utf8_1.2.4 BiocVersion_3.21.1
## [73] pillar_1.9.0 limma_3.63.1
## [75] genefilter_1.89.0 splines_4.5.0
## [77] dplyr_1.1.4 lattice_0.22-6
## [79] survival_3.7-0 rtracklayer_1.67.0
## [81] bit_4.5.0 GEOquery_2.75.0
## [83] annotate_1.85.0 tidyselect_1.2.1
## [85] knitr_1.48 bookdown_0.41
## [87] xfun_0.49 scrime_1.3.5
## [89] statmod_1.5.0 UCSC.utils_1.3.0
## [91] yaml_2.3.10 evaluate_1.0.1
## [93] codetools_0.2-20 tibble_3.2.1
## [95] BiocManager_1.30.25 cli_3.6.3
## [97] xtable_1.8-4 jquerylib_0.1.4
## [99] Rcpp_1.0.13-1 png_0.1-8
## [101] XML_3.99-0.17 readr_2.1.5
## [103] blob_1.2.4 mclust_6.1.1
## [105] doRNG_1.8.6 sparseMatrixStats_1.19.0
## [107] bitops_1.0-9 illuminaio_0.49.0
## [109] purrr_1.0.2 crayon_1.5.3
## [111] rlang_1.1.4 KEGGREST_1.47.0