extractConsequenceGenomeWide {IsoformSwitchAnalyzeR} | R Documentation |
This function enables a genome-wide analysis of changes in isoform usage of isoforms with a common annotation.
Specifically, this function extracts the isoforms of interest and, for each category of annotation (such as signal peptides), plots the global distribution of IF (measuring isoform usage) for each subset of features in that category (e.g. with and without signal peptides). This enables a global analysis of isoforms with a common annotation. The annotations considered are (if added to the switchAnalyzeRlist) coding potential, intron retentions, isoform class code (Cufflinks/Cuffdiff data only), NMD status, ORFs, protein domains, signal peptides and whether switch consequences were identified.
The isoforms of interest can be defined as isoforms from differentially expressed genes, isoforms that are differentially expressed, or isoforms from genes with isoform switching, as controlled by featureToExtract.
This function offers both visualization of the result and analysis via summary statistics of the comparisons.
extractConsequenceGenomeWide(
    switchAnalyzeRlist,
    featureToExtract = 'isoformUsage',
    annotationToAnalyze = 'all',
    alpha=0.05,
    dIFcutoff = 0.1,
    log2FCcutoff = 1,
    violinPlot=TRUE,
    alphas=c(0.05, 0.001),
    localTheme=theme_bw(),
    plot=TRUE,
    returnResult=TRUE
)

extractGenomeWideAnalysis(
    switchAnalyzeRlist,
    featureToExtract = 'isoformUsage',
    annotationToAnalyze = 'all',
    alpha=0.05,
    dIFcutoff = 0.1,
    log2FCcutoff = 1,
    violinPlot=TRUE,
    alphas=c(0.05, 0.001),
    localTheme=theme_bw(),
    plot=TRUE,
    returnResult=TRUE
)
switchAnalyzeRlist |
A switchAnalyzeRlist object. |
featureToExtract |
This argument, given as a string, defines the set of isoforms which should be analyzed. The available options are 'isoformUsage' (the default), 'isoformExp' and 'geneExp'. |
annotationToAnalyze |
A vector of strings indicating what categories of annotation to analyze. Annotation types given here but not (yet) analyzed in the |
alpha |
The cutoff which the FDR corrected p-values (q-values) must be smaller than for calling significant switches. Default is 0.05. |
dIFcutoff |
The cutoff which the changes in (absolute) isoform usage must be larger than before an isoform is considered switching. This cutoff can remove cases where isoforms with (very) low dIF values are deemed significant and thereby included in the downstream analysis. This cutoff is analogous to having a cutoff on log2 fold change in a normal differential expression analysis of genes to ensure the genes have a certain effect size. Default is 0.1 (10%). |
log2FCcutoff |
The cutoff which the changes in (absolute) isoform or gene expression (log2 fold change) must be larger than before an isoform is considered for inclusion. Default is 1. |
violinPlot |
A logical indicating whether to make violin plots (if TRUE) or boxplots (if FALSE). Violin plots always have 3 black dots added, one for each of the 25th, 50th (median) and 75th percentiles of the data. Default is TRUE. |
alphas |
A numeric vector of length two giving the significance levels represented in plots. The numbers indicate the q-value cutoffs for significant (*) and highly significant (***), respectively. Default is 0.05 and 0.001, which should be interpreted as q<0.05 and q<0.001 respectively. q-values higher than these cutoffs are annotated as 'ns' (not significant). |
localTheme |
General ggplot2 theme with which the plot is made, see |
plot |
A logical indicating whether to generate the plot (if TRUE) or not (if FALSE). Default is TRUE. |
returnResult |
A logical indicating whether to return a data.frame with summary statistics of the comparisons (if TRUE) or not (if FALSE). Default is TRUE. |
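As a rough illustration of how the alpha and dIFcutoff arguments combine when selecting isoforms, the sketch below filters a toy table in base R. The data frame, its column names (isoform_id, dIF, q_value) and the filtering expression are assumptions made for illustration; they are not the package's internal code.

```r
# Toy isoform-level results; the column names are hypothetical
toyIsoforms <- data.frame(
    isoform_id = c("iso1", "iso2", "iso3", "iso4"),
    dIF        = c(0.25, -0.04, -0.30, 0.15),
    q_value    = c(0.001, 0.010, 0.200, 0.030),
    stringsAsFactors = FALSE
)

alpha     <- 0.05  # q-values must be smaller than this
dIFcutoff <- 0.1   # absolute dIF must be larger than this

# Keep isoforms that pass both the significance and the effect size cutoff
keep <- toyIsoforms$q_value < alpha & abs(toyIsoforms$dIF) > dIFcutoff
isoformsOfInterest <- toyIsoforms[keep, ]
print(isoformsOfInterest$isoform_id)
```

In this toy table iso2 is dropped because its dIF is too small even though it is significant, and iso3 is dropped because its q-value is above alpha.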
extractGenomeWideAnalysis is just a wrapper for extractConsequenceGenomeWide included for backward compatibility.
Changes in isoform usage are measured as the difference in isoform fraction (dIF) values, where isoform fraction (IF) values are calculated as <isoform_exp> / <gene_exp>.
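A small worked example of the IF and dIF calculation in base R may help. It assumes that gene expression is the sum of the expression of the gene's isoforms and that dIF is taken as condition 2 minus condition 1; the toy values and all object names are made up for illustration.

```r
# Toy expression values for one gene with two isoforms in two conditions
# (all names and numbers are hypothetical)
isoformExp <- data.frame(
    isoform_id = c("isoA", "isoB"),
    cond1      = c(30, 10),
    cond2      = c(10, 30)
)

# Assumption: gene expression is the sum of its isoforms' expression
geneExp <- colSums(isoformExp[, c("cond1", "cond2")])

# IF = <isoform_exp> / <gene_exp>, per condition
IF1 <- isoformExp$cond1 / geneExp[["cond1"]]
IF2 <- isoformExp$cond2 / geneExp[["cond2"]]

# Assumption: dIF is the IF in condition 2 minus the IF in condition 1
dIF <- IF2 - IF1
print(data.frame(isoform_id = isoformExp$isoform_id, IF1, IF2, dIF))
```

Here isoA goes from an IF of 0.75 to 0.25 (dIF = -0.5) while isoB changes by the same amount in the opposite direction, even though the gene expression is 40 in both conditions.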
The significance test is performed with R's built-in wilcox.test() (aka 'Mann-Whitney-U') with default parameters, and the resulting p-values are corrected via p.adjust() using FDR (Benjamini-Hochberg).
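The testing procedure described above can be sketched with simulated data. The toy IF values, the list layout and the category names below are assumptions for illustration only; the sketch shows one wilcox.test() per annotation subset comparing two conditions, FDR correction with p.adjust(), and how q-values could be mapped to the labels controlled by the alphas argument.

```r
set.seed(1)  # reproducible toy data

# Hypothetical IF values in two conditions for two annotation subsets
toyIF <- list(
    with_signal_peptide    = list(cond1 = rbeta(50, 2, 5), cond2 = rbeta(50, 5, 2)),
    without_signal_peptide = list(cond1 = rbeta(50, 2, 2), cond2 = rbeta(50, 2, 2))
)

# One Wilcoxon rank-sum (Mann-Whitney U) test per subset, default parameters
pValues <- sapply(toyIF, function(x) wilcox.test(x$cond1, x$cond2)$p.value)

# Benjamini-Hochberg FDR correction across the tests
qValues <- p.adjust(pValues, method = "fdr")

# Translate q-values into plot labels using the two significance levels
alphas <- c(0.05, 0.001)
sigLabel <- ifelse(qValues < alphas[2], "***",
            ifelse(qValues < alphas[1], "*", "ns"))
print(data.frame(pValues, qValues, sigLabel))
```

The nested ifelse() checks the stricter cutoff first, so a q-value below both levels is labeled '***' and not '*'.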
The arguments passed to annotationToAnalyze must be a combination of:
isoform_class_code: Divide transcripts based on differences in the transcript classification provided by Cufflinks (only available for data imported from Cufflinks/Cuffdiff). For an updated list of class codes see http://cole-trapnell-lab.github.io/cufflinks/cuffcompare/#transfrag-class-codes.
coding_potential: Divide transcripts based on differences in coding potential, as indicated by the CPAT analysis. Requires that importCPATanalysis has been used to add external CPAT analysis to the switchAnalyzeRlist.
intron_retention: Divide transcripts based on the presence of intron retentions (and their genomic positions). Requires that analyzeIntronRetention has been run.
ORF: Divide transcripts based on whether an ORF is annotated or not. Requires that the isoforms have been annotated with ORFs, either via identifyORF or by supplying a GTF file and setting addAnnotatedORFs=TRUE when creating the switchAnalyzeRlist.
NMD_status: Divide transcripts based on differences in sensitivity to Nonsense Mediated Decay (NMD). Requires that the isoforms have been annotated with PTC, either via identifyORF or by supplying a GTF file and setting addAnnotatedORFs=TRUE when creating the switchAnalyzeRlist.
domains_identified: Divide transcripts based on differences in the names and order of the domains identified by Pfam in the transcripts. Requires that importPFAManalysis has been used to add external Pfam analysis to the switchAnalyzeRlist. Requires that the isoforms are annotated with an ORF, either via identifyORF or by supplying a GTF file and setting addAnnotatedORFs=TRUE when creating the switchAnalyzeRlist.
signal_peptide_identified: Divide transcripts based on whether or not a signal peptide was identified by the SignalP analysis. Requires that analyzeSignalP has been used to add external SignalP analysis to the switchAnalyzeRlist. Requires that the isoforms are annotated with an ORF, either via analyzeORF or by supplying a GTF file and setting addAnnotatedORFs=TRUE when creating the switchAnalyzeRlist (and are thereby also affected by removeNoncodinORFs=TRUE in analyzeCPAT).
switch_consequences: Whether the gene is involved in isoform switches with predicted consequences. Requires that analyzeSwitchConsequences has been used.
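Since annotationToAnalyze only accepts the category names listed above (or 'all'), it can be convenient to check a selection before running the analysis. The helper below is a hypothetical base R sketch and is not part of IsoformSwitchAnalyzeR; it assumes that 'all' simply stands for every listed category.

```r
# Category names listed above for annotationToAnalyze (besides 'all')
validAnnotation <- c(
    "isoform_class_code", "coding_potential", "intron_retention", "ORF",
    "NMD_status", "domains_identified", "signal_peptide_identified",
    "switch_consequences"
)

# Hypothetical helper, not part of IsoformSwitchAnalyzeR
checkAnnotation <- function(annotationToAnalyze) {
    # Assumption: 'all' expands to every listed category
    if (identical(annotationToAnalyze, "all")) {
        return(validAnnotation)
    }
    unknown <- setdiff(annotationToAnalyze, validAnnotation)
    if (length(unknown) > 0) {
        stop("Unknown annotation categories: ", paste(unknown, collapse = ", "))
    }
    annotationToAnalyze
}

selected <- checkAnnotation(c("NMD_status", "ORF"))
print(selected)
```

The checked vector could then be passed on as annotationToAnalyze = selected. Note that the names are case sensitive in this sketch, so 'nmd_status' would be rejected.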
If plot=TRUE: A plot of the distribution of IF values as a function of the annotation and condition compared.
If returnResult=TRUE: A data.frame with the summary statistics from the comparison of the two conditions with a wilcox.test.
Kristoffer Vitting-Seerup
Vitting-Seerup et al. The Landscape of Isoform Switches in Human Cancers. Mol. Cancer Res. (2017).
isoformSwitchTestDEXSeq
isoformSwitchTestDRIMSeq
analyzeORF
analyzeAlternativeSplicing
analyzeCPAT
analyzePFAM
analyzeSignalP
analyzeSwitchConsequences
extractConsequenceEnrichment
extractConsequenceEnrichmentComparison
### Load example data
data("exampleSwitchListAnalyzed")

### make the genome wide analysis
summaryStatistics <- extractConsequenceGenomeWide(
    switchAnalyzeRlist = exampleSwitchListAnalyzed,
    featureToExtract = 'isoformUsage', # alternatives are 'isoformExp' and 'geneExp'
    plot=TRUE,
    returnResult = TRUE
)