aggregate_duplicates {tidybulk} | R Documentation |
aggregate_duplicates() takes as input a 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | and returns a 'tbl' in which duplicated transcripts have been aggregated.
aggregate_duplicates(
  .data, .sample = NULL, .transcript = NULL, .abundance = NULL,
  aggregation_function = sum, keep_integer = TRUE
)

## S4 method for signature 'spec_tbl_df'
aggregate_duplicates(
  .data, .sample = NULL, .transcript = NULL, .abundance = NULL,
  aggregation_function = sum, keep_integer = TRUE
)

## S4 method for signature 'tbl_df'
aggregate_duplicates(
  .data, .sample = NULL, .transcript = NULL, .abundance = NULL,
  aggregation_function = sum, keep_integer = TRUE
)

## S4 method for signature 'tidybulk'
aggregate_duplicates(
  .data, .sample = NULL, .transcript = NULL, .abundance = NULL,
  aggregation_function = sum, keep_integer = TRUE
)

## S4 method for signature 'SummarizedExperiment'
aggregate_duplicates(
  .data, .sample = NULL, .transcript = NULL, .abundance = NULL,
  aggregation_function = sum, keep_integer = TRUE
)

## S4 method for signature 'RangedSummarizedExperiment'
aggregate_duplicates(
  .data, .sample = NULL, .transcript = NULL, .abundance = NULL,
  aggregation_function = sum, keep_integer = TRUE
)
.data
    A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> |
.sample
    The name of the sample column
.transcript
    The name of the transcript/gene column
.abundance
    The name of the transcript/gene abundance column
aggregation_function
    A function for counts aggregation (e.g., sum, median, or mean)
keep_integer
    A boolean. Whether to force the aggregated counts to integer
This function aggregates duplicated transcripts (e.g., isoforms, Ensembl identifiers). For example, we often have to convert Ensembl identifiers to gene/transcript symbols, and in doing so several identifiers can map to the same symbol, which produces duplicates. 'aggregate_duplicates' takes a tibble and column names (as symbols; for 'sample', 'transcript' and 'count') as arguments and returns a tibble in which transcripts with the same name are aggregated. All remaining columns are appended, and factor and boolean columns are appended as characters.
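To make the idea of aggregation concrete, the following is a minimal conceptual sketch using plain dplyr on a toy tibble. It is not tidybulk's implementation: the data, the object names 'toy_counts' and 'aggregated', and the use of group_by()/summarise() are assumptions for illustration only, and the sketch ignores extra annotation columns and the 'keep_integer' option.

```r
library(dplyr)

# Hypothetical toy data: transcript "A" appears twice in sample "s1",
# as can happen when two identifiers map to the same gene symbol.
toy_counts <- tibble(
  sample     = c("s1", "s1", "s1", "s2", "s2"),
  transcript = c("A",  "A",  "B",  "A",  "B"),
  count      = c(10,   5,    7,    3,    4)
)

# Collapse rows that share the same sample/transcript pair by summing
# their counts (sum plays the role of 'aggregation_function' here).
aggregated <- toy_counts %>%
  group_by(sample, transcript) %>%
  summarise(count = sum(count), .groups = "drop")

print(aggregated)
```

In this sketch the two "A" rows of sample "s1" collapse into a single row with a count of 15, while the other rows are unchanged. Replacing sum with median or mean in the summarise() call would mimic a different choice of aggregation function.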
A 'tbl' object with aggregated transcript abundance and annotation
A 'SummarizedExperiment' object
aggregate_duplicates(
  tidybulk::counts_mini,
  sample,
  transcript,
  `count`,
  aggregation_function = sum
)