############################################################################## ############################################################################## ### ### Running command: ### ### /home/biocbuild/R/R-4.3.1/bin/R CMD check --install=check:GenomicAlignments.install-out.txt --library=/home/biocbuild/R/R-4.3.1/site-library --no-vignettes --timings GenomicAlignments_1.38.0.tar.gz ### ############################################################################## ############################################################################## * using log directory ‘/home/biocbuild/bbs-3.18-bioc/meat/GenomicAlignments.Rcheck’ * using R version 4.3.1 (2023-06-16) * using platform: aarch64-unknown-linux-gnu (64-bit) * R was compiled by gcc (GCC) 10.3.1 GNU Fortran (GCC) 10.3.1 * running under: openEuler 22.03 (LTS-SP1) * using session charset: UTF-8 * using option ‘--no-vignettes’ * checking for file ‘GenomicAlignments/DESCRIPTION’ ... OK * this is package ‘GenomicAlignments’ version ‘1.38.0’ * package encoding: UTF-8 * checking package namespace information ... OK * checking package dependencies ... NOTE Depends: includes the non-default packages: 'BiocGenerics', 'S4Vectors', 'IRanges', 'GenomeInfoDb', 'GenomicRanges', 'SummarizedExperiment', 'Biostrings', 'Rsamtools' Adding so many packages to the search path is excessive and importing selectively is preferable. * checking if this is a source package ... OK * checking if there is a namespace ... OK * checking for hidden files and directories ... OK * checking for portable file names ... OK * checking for sufficient/correct file permissions ... OK * checking whether package ‘GenomicAlignments’ can be installed ... OK * used C compiler: ‘gcc (GCC) 10.3.1’ * checking installed package size ... OK * checking package directory ... OK * checking ‘build’ directory ... OK * checking DESCRIPTION meta-information ... NOTE Packages listed in more than one of Depends, Imports, Suggests, Enhances: ‘methods’ ‘BiocGenerics’ ‘S4Vectors’ ‘IRanges’ ‘GenomicRanges’ ‘Biostrings’ ‘Rsamtools’ A package should be listed in only one of these fields. * checking top-level files ... OK * checking for left-over files ... OK * checking index information ... OK * checking package subdirectories ... NOTE Problems with news in ‘NEWS’: Cannot process chunk/lines: No NEW FEATURES or SIGNIFICANT USER-VISIBLE CHANGES or BUG FIXES since Cannot process chunk/lines: version 1.18.0 Cannot process chunk/lines: No NEW FEATURES or SIGNIFICANT USER-VISIBLE CHANGES or BUG FIXES since Cannot process chunk/lines: version 1.16.0 Cannot process chunk/lines: The first version of GenomicAlignments was included in Bioconductor 2.14. Cannot process chunk/lines: The package was created from existing code in IRanges, ShortRead, Cannot process chunk/lines: Rsamtools and GenomicRanges. * checking R files for non-ASCII characters ... OK * checking R files for syntax errors ... OK * checking whether the package can be loaded ... OK * checking whether the package can be loaded with stated dependencies ... OK * checking whether the package can be unloaded cleanly ... OK * checking whether the namespace can be loaded with stated dependencies ... OK * checking whether the namespace can be unloaded cleanly ... OK * checking loading without being on the library search path ... OK * checking dependencies in R code ... NOTE ':::' calls which should be '::': ‘S4Vectors:::makeClassinfoRowForCompactPrinting’ ‘S4Vectors:::makePrettyMatrixForCompactPrinting’ See the note in ?`:::` about the use of this operator. Unexported objects imported by ':::' calls: ‘Rsamtools:::.BamViews_delegate’ ‘Rsamtools:::.findMateWithinGroups’ ‘Rsamtools:::.load_bamcols_from_scanBam_res’ See the note in ?`:::` about the use of this operator. * checking S3 generic/method consistency ... OK * checking replacement functions ... OK * checking foreign function calls ... OK * checking R code for possible problems ... OK * checking Rd files ... WARNING checkRd: (5) GAlignmentPairs-class.Rd:90-94: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:102-133: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:134-137: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:138-143: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:144-160: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:161-172: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:173-179: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:180-183: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:184-188: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:189-193: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:194-203: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:204-210: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:211-216: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:217-222: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:223-232: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:240-244: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:252-257: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:258-264: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:272-298: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:299-328: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:329-336: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:337-340: \item in \describe must have non-empty label checkRd: (5) GAlignmentPairs-class.Rd:348-359: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:148-154: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:162-165: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:166-171: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:172-179: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:180-183: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:184-190: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:191-195: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:196-202: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:203-213: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:214-221: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:222-227: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:228-232: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:233-241: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:242-248: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:249-254: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:255-260: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:261-270: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:278-292: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:293-335: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:336-344: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:351-357: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:365-369: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:375-381: \item in \describe must have non-empty label checkRd: (5) GAlignments-class.Rd:388-400: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:67-70: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:78-81: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:82-85: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:86-90: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:91-94: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:95-99: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:100-105: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:106-112: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:113-118: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:119-123: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:124-129: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:130-134: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:135-139: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:140-146: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:147-152: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:153-158: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:159-162: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:170-187: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:188-206: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:207-215: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:216-222: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:223-228: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:236-240: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:241-244: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:245-250: \item in \describe must have non-empty label checkRd: (5) GAlignmentsList-class.Rd:256-262: \item in \describe must have non-empty label checkRd: (5) GappedReads-class.Rd:40-44: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:133-137: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:138-155: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:156-182: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:183-185: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:186-192: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:200-204: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:213-227: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:228-235: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:236-245: \item in \describe must have non-empty label checkRd: (5) OverlapEncodings-class.Rd:246-261: \item in \describe must have non-empty label checkRd: (5) summarizeOverlaps-methods.Rd:226-246: \item in \describe must have non-empty label * checking Rd metadata ... OK * checking Rd cross-references ... OK * checking for missing documentation entries ... OK * checking for code/documentation mismatches ... OK * checking Rd \usage sections ... OK * checking Rd contents ... OK * checking for unstated dependencies in examples ... OK * checking line endings in C/C++/Fortran sources/headers ... OK * checking compiled code ... NOTE Note: information on .o files is not available * checking files in ‘vignettes’ ... OK * checking examples ... ERROR Running examples in ‘GenomicAlignments-Ex.R’ failed The error most likely occurred in: > base::assign(".ptime", proc.time(), pos = "CheckExEnv") > ### Name: readGAlignments > ### Title: Reading genomic alignments from a file > ### Aliases: readGAlignments readGAlignments,BamFile-method > ### readGAlignments,character-method readGAlignments,BamViews-method > ### readGAlignmentPairs readGAlignmentPairs,BamFile-method > ### readGAlignmentPairs,character-method readGAlignmentsList > ### readGAlignmentsList,BamFile-method > ### readGAlignmentsList,character-method readGappedReads > ### readGappedReads,BamFile-method readGappedReads,character-method > ### Keywords: manip > > ### ** Examples > > ## --------------------------------------------------------------------- > ## A. readGAlignments() > ## --------------------------------------------------------------------- > > ## Simple use: > bamfile <- system.file("extdata", "ex1.bam", package="Rsamtools", + mustWork=TRUE) > gal1 <- readGAlignments(bamfile) > gal1 GAlignments object with 3271 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] seq1 + 36M 36 1 36 36 [2] seq1 + 35M 35 3 37 35 [3] seq1 + 35M 35 5 39 35 [4] seq1 + 36M 36 6 41 36 [5] seq1 + 35M 35 9 43 35 ... ... ... ... ... ... ... ... [3267] seq2 + 35M 35 1524 1558 35 [3268] seq2 + 35M 35 1524 1558 35 [3269] seq2 - 35M 35 1528 1562 35 [3270] seq2 - 35M 35 1532 1566 35 [3271] seq2 - 35M 35 1533 1567 35 njunc [1] 0 [2] 0 [3] 0 [4] 0 [5] 0 ... ... [3267] 0 [3268] 0 [3269] 0 [3270] 0 [3271] 0 ------- seqinfo: 2 sequences from an unspecified genome > names(gal1) NULL > > ## Using the 'use.names' arg: > gal2 <- readGAlignments(bamfile, use.names=TRUE) > gal2 GAlignments object with 3271 alignments and 0 metadata columns: seqnames strand cigar qwidth start B7_591:4:96:693:509 seq1 + 36M 36 1 EAS54_65:7:152:368:113 seq1 + 35M 35 3 EAS51_64:8:5:734:57 seq1 + 35M 35 5 B7_591:1:289:587:906 seq1 + 36M 36 6 EAS56_59:8:38:671:758 seq1 + 35M 35 9 ... ... ... ... ... ... EAS56_63:4:141:9:811 seq2 + 35M 35 1524 EAS114_30:6:277:397:932 seq2 + 35M 35 1524 EAS139_11:7:50:1229:1313 seq2 - 35M 35 1528 EAS54_65:3:320:20:250 seq2 - 35M 35 1532 EAS114_26:7:37:79:581 seq2 - 35M 35 1533 end width njunc B7_591:4:96:693:509 36 36 0 EAS54_65:7:152:368:113 37 35 0 EAS51_64:8:5:734:57 39 35 0 B7_591:1:289:587:906 41 36 0 EAS56_59:8:38:671:758 43 35 0 ... ... ... ... EAS56_63:4:141:9:811 1558 35 0 EAS114_30:6:277:397:932 1558 35 0 EAS139_11:7:50:1229:1313 1562 35 0 EAS54_65:3:320:20:250 1566 35 0 EAS114_26:7:37:79:581 1567 35 0 ------- seqinfo: 2 sequences from an unspecified genome > head(names(gal2)) [1] "B7_591:4:96:693:509" "EAS54_65:7:152:368:113" "EAS51_64:8:5:734:57" [4] "B7_591:1:289:587:906" "EAS56_59:8:38:671:758" "EAS56_61:6:18:467:281" > > ## Using the 'param' arg to drop PCR or optical duplicates as well as > ## secondary alignments, and to load additional BAM fields: > param <- ScanBamParam(flag=scanBamFlag(isDuplicate=FALSE, + isSecondaryAlignment=FALSE), + what=c("qual", "flag")) > gal3 <- readGAlignments(bamfile, param=param) > gal3 GAlignments object with 3271 alignments and 2 metadata columns: seqnames strand cigar qwidth start end width [1] seq1 + 36M 36 1 36 36 [2] seq1 + 35M 35 3 37 35 [3] seq1 + 35M 35 5 39 35 [4] seq1 + 36M 36 6 41 36 [5] seq1 + 35M 35 9 43 35 ... ... ... ... ... ... ... ... [3267] seq2 + 35M 35 1524 1558 35 [3268] seq2 + 35M 35 1524 1558 35 [3269] seq2 - 35M 35 1528 1562 35 [3270] seq2 - 35M 35 1532 1566 35 [3271] seq2 - 35M 35 1533 1567 35 njunc | qual flag | [1] 0 | <<<<<<<<<<...<<<<<;:<;7 73 [2] 0 | <<<<<<<<<<...9<<3/:<6): 73 [3] 0 | <<<<<<<<<<...<3;);3*8/5 137 [4] 0 | (-&----,--...,+-,),''*, 137 [5] 0 | <<<<<<<<<<...<<7<<;:<5% 137 ... ... . ... ... [3267] 0 | <<<;<<<<<<...<<<..));;. 137 [3268] 0 | <<<<<<<<<<...<<<:8(,0%( 73 [3269] 0 | <<<<,<&<7<...<<<<<<<<<< 83 [3270] 0 | +'''/<<<<7...<<<<<<<<<< 147 [3271] 0 | 3,,,===6==...========== 83 ------- seqinfo: 2 sequences from an unspecified genome > mcols(gal3) DataFrame with 3271 rows and 2 columns qual flag 1 <<<<<<<<<<...<<<<<;:<;7 73 2 <<<<<<<<<<...9<<3/:<6): 73 3 <<<<<<<<<<...<3;);3*8/5 137 4 (-&----,--...,+-,),''*, 137 5 <<<<<<<<<<...<<7<<;:<5% 137 ... ... ... 3267 <<<;<<<<<<...<<<..));;. 137 3268 <<<<<<<<<<...<<<:8(,0%( 73 3269 <<<<,<&<7<...<<<<<<<<<< 83 3270 +'''/<<<<7...<<<<<<<<<< 147 3271 3,,,===6==...========== 83 > > ## Using the 'param' arg to load alignments from particular regions. > which <- IRangesList(seq1=IRanges(1000, 1100), + seq2=IRanges(c(1546, 1555, 1567), width=10)) > param <- ScanBamParam(which=which) > gal4 <- readGAlignments(bamfile, use.names=TRUE, param=param) > gal4 GAlignments object with 205 alignments and 0 metadata columns: seqnames strand cigar qwidth start EAS114_26:7:86:308:648 seq1 + 35M 35 970 EAS56_65:8:206:563:262 seq1 + 35M 35 971 EAS56_65:6:82:822:767 seq1 + 35M 35 972 EAS56_57:5:207:926:427 seq1 + 35M 35 973 EAS51_62:7:144:28:475 seq1 + 35M 35 974 ... ... ... ... ... ... EAS114_30:6:277:397:932 seq2 + 35M 35 1524 EAS139_11:7:50:1229:1313 seq2 - 35M 35 1528 EAS54_65:3:320:20:250 seq2 - 35M 35 1532 EAS114_26:7:37:79:581 seq2 - 35M 35 1533 EAS114_26:7:37:79:581 seq2 - 35M 35 1533 end width njunc EAS114_26:7:86:308:648 1004 35 0 EAS56_65:8:206:563:262 1005 35 0 EAS56_65:6:82:822:767 1006 35 0 EAS56_57:5:207:926:427 1007 35 0 EAS51_62:7:144:28:475 1008 35 0 ... ... ... ... EAS114_30:6:277:397:932 1558 35 0 EAS139_11:7:50:1229:1313 1562 35 0 EAS54_65:3:320:20:250 1566 35 0 EAS114_26:7:37:79:581 1567 35 0 EAS114_26:7:37:79:581 1567 35 0 ------- seqinfo: 2 sequences from an unspecified genome > > ## IMPORTANT NOTE: A given record is loaded one time for each region > ## it overlaps with. We call this "duplicated record selection" (this > ## is a scanBam() feature, readGAlignments() is based on scanBam()): > which <- IRangesList(seq2=IRanges(c(1555, 1567), width=10)) > param <- ScanBamParam(which=which) > gal5 <- readGAlignments(bamfile, use.names=TRUE, param=param) > gal5 # record EAS114_26:7:37:79:581 was loaded twice GAlignments object with 9 alignments and 0 metadata columns: seqnames strand cigar qwidth start EAS54_71:8:105:854:975 seq2 - 33M 33 1523 EAS51_62:4:187:907:145 seq2 - 35M 35 1524 EAS54_71:4:284:269:882 seq2 + 34M 34 1524 EAS56_63:4:141:9:811 seq2 + 35M 35 1524 EAS114_30:6:277:397:932 seq2 + 35M 35 1524 EAS139_11:7:50:1229:1313 seq2 - 35M 35 1528 EAS54_65:3:320:20:250 seq2 - 35M 35 1532 EAS114_26:7:37:79:581 seq2 - 35M 35 1533 EAS114_26:7:37:79:581 seq2 - 35M 35 1533 end width njunc EAS54_71:8:105:854:975 1555 33 0 EAS51_62:4:187:907:145 1558 35 0 EAS54_71:4:284:269:882 1557 34 0 EAS56_63:4:141:9:811 1558 35 0 EAS114_30:6:277:397:932 1558 35 0 EAS139_11:7:50:1229:1313 1562 35 0 EAS54_65:3:320:20:250 1566 35 0 EAS114_26:7:37:79:581 1567 35 0 EAS114_26:7:37:79:581 1567 35 0 ------- seqinfo: 2 sequences from an unspecified genome > > ## This becomes clearer if we use 'with.which_label=TRUE' to identify > ## the region in 'which' where each element in 'gal5' originates from. > gal5 <- readGAlignments(bamfile, use.names=TRUE, param=param, + with.which_label=TRUE) > gal5 GAlignments object with 9 alignments and 1 metadata column: seqnames strand cigar qwidth start EAS54_71:8:105:854:975 seq2 - 33M 33 1523 EAS51_62:4:187:907:145 seq2 - 35M 35 1524 EAS54_71:4:284:269:882 seq2 + 34M 34 1524 EAS56_63:4:141:9:811 seq2 + 35M 35 1524 EAS114_30:6:277:397:932 seq2 + 35M 35 1524 EAS139_11:7:50:1229:1313 seq2 - 35M 35 1528 EAS54_65:3:320:20:250 seq2 - 35M 35 1532 EAS114_26:7:37:79:581 seq2 - 35M 35 1533 EAS114_26:7:37:79:581 seq2 - 35M 35 1533 end width njunc | which_label | EAS54_71:8:105:854:975 1555 33 0 | seq2:1555-1564 EAS51_62:4:187:907:145 1558 35 0 | seq2:1555-1564 EAS54_71:4:284:269:882 1557 34 0 | seq2:1555-1564 EAS56_63:4:141:9:811 1558 35 0 | seq2:1555-1564 EAS114_30:6:277:397:932 1558 35 0 | seq2:1555-1564 EAS139_11:7:50:1229:1313 1562 35 0 | seq2:1555-1564 EAS54_65:3:320:20:250 1566 35 0 | seq2:1555-1564 EAS114_26:7:37:79:581 1567 35 0 | seq2:1555-1564 EAS114_26:7:37:79:581 1567 35 0 | seq2:1567-1576 ------- seqinfo: 2 sequences from an unspecified genome > > ## Not surprisingly, we also get "duplicated record selection" when > ## 'which' contains repeated or overlapping regions. Using the same > ## regions as we did for 'gal4' above, except that now we're > ## repeating the region on seq1: > which <- IRangesList(seq1=rep(IRanges(1000, 1100), 2), + seq2=IRanges(c(1546, 1555, 1567), width=10)) > param <- ScanBamParam(which=which) > gal4b <- readGAlignments(bamfile, use.names=TRUE, param=param) > length(gal4b) # > length(gal4), because all the records overlapping [1] 366 > # with bases 1000 to 1100 on seq1 are now duplicated > > ## The "duplicated record selection" will artificially increase the > ## coverage or affect other downstream results. It can be mitigated > ## (but not completely eliminated) by first "reducing" the set of > ## regions: > which <- reduce(which) > which IRangesList object of length 2: $seq1 IRanges object with 1 range and 0 metadata columns: start end width [1] 1000 1100 101 $seq2 IRanges object with 2 ranges and 0 metadata columns: start end width [1] 1546 1564 19 [2] 1567 1576 10 > param <- ScanBamParam(which=which) > gal4c <- readGAlignments(bamfile, use.names=TRUE, param=param) > length(gal4c) # < length(gal4), because the 2 first original regions [1] 197 > # on seq2 were merged into a single one > > ## Note that reducing the set of regions didn't completely eliminate > ## "duplicated record selection". Records that overlap the 2 reduced > ## regions on seq2 (which$seq2) are loaded twice (like for 'gal5' > ## above). See example D. below for how to completely eliminate > ## "duplicated record selection". > > ## Using the 'param' arg to load tags. Except for MF and Aq, the tags > ## specified below are predefined tags (see the SAM Spec for the list > ## of predefined tags and their meaning). > param <- ScanBamParam(tag=c("MF", "Aq", "NM", "UQ", "H0", "H1"), + what="isize") > gal6 <- readGAlignments(bamfile, param=param) > mcols(gal6) # "tag" cols always after "what" cols DataFrame with 3271 rows and 7 columns isize MF Aq NM UQ H0 H1 1 NA 18 73 0 0 1 0 2 NA 18 66 0 0 1 0 3 NA 18 66 0 0 1 0 4 NA 130 63 5 38 0 0 5 NA 18 72 0 0 1 0 ... ... ... ... ... ... ... ... 3267 NA 32 0 3 47 2 27 3268 NA 32 0 3 42 2 85 3269 -187 18 0 1 11 3 7 3270 -200 18 6 2 24 1 2 3271 -219 18 27 2 23 0 1 > > ## With a BamViews object: > fls <- system.file("extdata", "ex1.bam", package="Rsamtools", + mustWork=TRUE) > bv <- BamViews(fls, + bamSamples=DataFrame(info="test", row.names="ex1"), + auto.range=TRUE) > ## Note that the "readGAlignments" method for BamViews objects > ## requires the ShortRead package to be installed. > aln <- readGAlignments(bv) > aln List of length 1 names(1): ex1 > aln[[1]] GAlignments object with 3271 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] seq1 + 36M 36 1 36 36 [2] seq1 + 35M 35 3 37 35 [3] seq1 + 35M 35 5 39 35 [4] seq1 + 36M 36 6 41 36 [5] seq1 + 35M 35 9 43 35 ... ... ... ... ... ... ... ... [3267] seq2 + 35M 35 1524 1558 35 [3268] seq2 + 35M 35 1524 1558 35 [3269] seq2 - 35M 35 1528 1562 35 [3270] seq2 - 35M 35 1532 1566 35 [3271] seq2 - 35M 35 1533 1567 35 njunc [1] 0 [2] 0 [3] 0 [4] 0 [5] 0 ... ... [3267] 0 [3268] 0 [3269] 0 [3270] 0 [3271] 0 ------- seqinfo: 2 sequences from an unspecified genome > aln[colnames(bv)] List of length 1 names(1): ex1 > mcols(aln) DataFrame with 1 row and 1 column info ex1 test > > ## --------------------------------------------------------------------- > ## B. readGAlignmentPairs() > ## --------------------------------------------------------------------- > galp1 <- readGAlignmentPairs(bamfile) > head(galp1) GAlignmentPairs object with 6 pairs, strandMode=1, and 0 metadata columns: seqnames strand : ranges -- ranges : -- [1] seq1 + : 36-70 -- 185-219 [2] seq1 + : 49-84 -- 224-259 [3] seq1 + : 51-85 -- 228-262 [4] seq1 + : 60-95 -- 224-259 [5] seq1 + : 60-94 -- 235-269 [6] seq1 + : 61-95 -- 248-282 ------- seqinfo: 2 sequences from an unspecified genome > names(galp1) NULL > > ## Here we use the 'param' arg to filter by proper pair, drop PCR / > ## optical duplicates, and drop secondary alignments. Filtering by > ## proper pair and dropping secondary alignments can help make the > ## pairing algorithm run significantly faster: > param <- ScanBamParam(flag=scanBamFlag(isProperPair=TRUE, + isDuplicate=FALSE, + isSecondaryAlignment=FALSE)) > galp2 <- readGAlignmentPairs(bamfile, use.names=TRUE, param=param) > galp2 GAlignmentPairs object with 1572 pairs, strandMode=1, and 0 metadata columns: seqnames strand : ranges -- ranges : -- EAS54_61:4:143:69:578 seq1 + : 36-70 -- 185-219 B7_593:4:106:316:452 seq1 + : 49-84 -- 224-259 EAS54_65:3:321:311:983 seq1 + : 51-85 -- 228-262 B7_591:5:42:540:501 seq1 + : 60-95 -- 224-259 EAS192_3:5:223:142:410 seq1 + : 60-94 -- 235-269 ... ... ... ... ... ... ... EAS139_11:5:52:1278:1478 seq2 - : 1513-1547 -- 1322-1356 EAS1_97:4:274:287:423 seq2 - : 1515-1549 -- 1332-1366 EAS54_71:8:105:854:975 seq2 - : 1523-1555 -- 1354-1388 EAS139_11:7:50:1229:1313 seq2 - : 1528-1562 -- 1376-1410 EAS114_26:7:37:79:581 seq2 - : 1533-1567 -- 1349-1383 ------- seqinfo: 2 sequences from an unspecified genome > head(galp2) GAlignmentPairs object with 6 pairs, strandMode=1, and 0 metadata columns: seqnames strand : ranges -- ranges : -- EAS54_61:4:143:69:578 seq1 + : 36-70 -- 185-219 B7_593:4:106:316:452 seq1 + : 49-84 -- 224-259 EAS54_65:3:321:311:983 seq1 + : 51-85 -- 228-262 B7_591:5:42:540:501 seq1 + : 60-95 -- 224-259 EAS192_3:5:223:142:410 seq1 + : 60-94 -- 235-269 EAS56_61:6:227:259:597 seq1 + : 61-95 -- 248-282 ------- seqinfo: 2 sequences from an unspecified genome > head(names(galp2)) [1] "EAS54_61:4:143:69:578" "B7_593:4:106:316:452" "EAS54_65:3:321:311:983" [4] "B7_591:5:42:540:501" "EAS192_3:5:223:142:410" "EAS56_61:6:227:259:597" > > ## --------------------------------------------------------------------- > ## C. readGAlignmentsList() > ## --------------------------------------------------------------------- > library(pasillaBamSubset) > > ## 'file' as character. > bam <- untreated3_chr4() > galist1 <- readGAlignmentsList(bam) > galist1[1:3] GAlignmentsList object of length 3: [[1]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 169 205 37 [2] chr4 - 37M 37 326 362 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[2]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 946 982 37 [2] chr4 - 37M 37 986 1022 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[3]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 943 979 37 [2] chr4 - 37M 37 1086 1122 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome > length(galist1) [1] 96636 > table(elementNROWS(galist1)) 1 2 3 4 5 6 7 8 9 18191 78297 69 60 7 8 2 1 1 > > ## When 'file' is a BamFile, 'asMates' must be TRUE. If FALSE, > ## the data are treated as single-end and each list element of the > ## GAlignmentsList will be of length 1. For single-end data > ## use readGAlignments(). > bamfile <- BamFile(bam, yieldSize=3, asMates=TRUE) > readGAlignmentsList(bamfile) GAlignmentsList object of length 3: [[1]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 169 205 37 [2] chr4 - 37M 37 326 362 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[2]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 946 982 37 [2] chr4 - 37M 37 986 1022 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[3]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 943 979 37 [2] chr4 - 37M 37 1086 1122 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome > > ## Use a 'param' to fine tune the results. > param <- ScanBamParam(flag=scanBamFlag(isProperPair=TRUE)) > galist2 <- readGAlignmentsList(bam, param=param) > galist2[1:3] GAlignmentsList object of length 3: [[1]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 169 205 37 [2] chr4 - 37M 37 326 362 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[2]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 946 982 37 [2] chr4 - 37M 37 986 1022 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[3]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 943 979 37 [2] chr4 - 37M 37 1086 1122 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome > length(galist2) [1] 45828 > table(elementNROWS(galist2)) 2 45828 > > ## For paired-end data, we can set the 'strandMode' parameter to > ## infer the strand of a pair from the strand of the first and > ## last alignments in the pair > galist3 <- readGAlignmentsList(bam, param=param, strandMode=0) > galist3[1:3] GAlignmentsList object of length 3: [[1]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 * 37M 37 169 205 37 [2] chr4 * 37M 37 326 362 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[2]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 * 37M 37 946 982 37 [2] chr4 * 37M 37 986 1022 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[3]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 * 37M 37 943 979 37 [2] chr4 * 37M 37 1086 1122 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome > galist4 <- readGAlignmentsList(bam, param=param, strandMode=1) > galist4[1:3] GAlignmentsList object of length 3: [[1]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 169 205 37 [2] chr4 + 37M 37 326 362 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[2]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 946 982 37 [2] chr4 + 37M 37 986 1022 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[3]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 + 37M 37 943 979 37 [2] chr4 + 37M 37 1086 1122 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome > galist5 <- readGAlignmentsList(bam, param=param, strandMode=2) > galist5[1:3] GAlignmentsList object of length 3: [[1]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 - 37M 37 169 205 37 [2] chr4 - 37M 37 326 362 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[2]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 - 37M 37 946 982 37 [2] chr4 - 37M 37 986 1022 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome [[3]] GAlignments object with 2 alignments and 0 metadata columns: seqnames strand cigar qwidth start end width [1] chr4 - 37M 37 943 979 37 [2] chr4 - 37M 37 1086 1122 37 njunc [1] 0 [2] 0 ------- seqinfo: 8 sequences from an unspecified genome > > ## --------------------------------------------------------------------- > ## D. COMPARING 4 STRATEGIES FOR LOADING THE ALIGNMENTS THAT OVERLAP > ## WITH THE EXONIC REGIONS ON FLY CHROMMOSOME 4 > ## --------------------------------------------------------------------- > library(pasillaBamSubset) > bam <- untreated1_chr4() > > library(TxDb.Dmelanogaster.UCSC.dm3.ensGene) Loading required package: GenomicFeatures Loading required package: AnnotationDbi > txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene > ex <- exons(txdb) > seqlevels(ex, pruning.mode="coarse") <- "chr4" > length(ex) [1] 1002 > > ## Some of the exons overlap with each other: > isDisjoint(ex) # FALSE [1] FALSE > exonic_regions <- reduce(ex) > isDisjoint(exonic_regions) # no more overlaps [1] TRUE > length(exonic_regions) [1] 786 > > ## Strategy #1: slow and loads a lot of records more than once (see > ## "duplicated record selection" in example A. above). > param1 <- ScanBamParam(which=ex) > gal1 <- readGAlignments(bam, param=param1) * checking for unstated dependencies in ‘tests’ ... OK * checking tests ... Running ‘run_unitTests.R’ OK * checking for unstated dependencies in vignettes ... OK * checking package vignettes in ‘inst/doc’ ... OK * checking running R code from vignettes ... SKIPPED * checking re-building of vignette outputs ... SKIPPED * checking PDF version of manual ... OK * DONE Status: 1 ERROR, 1 WARNING, 5 NOTEs See ‘/home/biocbuild/bbs-3.18-bioc/meat/GenomicAlignments.Rcheck/00check.log’ for details.