computeIA {SemDist}    R Documentation

Compute information accretion for an ontology

Description

Calculates information accretion for each term in the specified ontology, using either user-specified data or the sequence annotations for the specified organism. Organism-specific annotation packages must be installed separately; see the Note section.

Usage

computeIA(ont, organism, evcodes = NULL, specify.ont = FALSE, 
          myont = NULL, specify.annotations = FALSE, 
          annotfile = NULL)

Arguments

ont

Character string indicating which ontology to use. One of "CC", "MF", or "BP", corresponding to Cellular Component, Molecular Function, and Biological Process, respectively.

organism

A character vector indicating which organism's annotation data to use.

evcodes

A character vector specifying which evidence codes to use in the ontology data. Default NULL value causes all codes to be used.

specify.ont

A boolean indicating whether the user wants to specify their own version of the ontology.

myont

Character object giving the file from which to read the user-specified ontology. The ontology should be specified as a tab-delimited file with 2 columns (no header). Each row in the file should indicate a parent-child relationship between two GO accessions (e.g. "GO:0003674 GO:0004000", with the two accessions separated by a tab).
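A minimal sketch of producing a file in this layout with base R. The edge list and the temporary file path are hypothetical and for illustration only; the parent-first column order is an assumption taken from the example above.

```r
# Hypothetical parent-child pairs (assumed order: parent in column 1, child in column 2)
edges <- data.frame(
  parent = c("GO:0003674", "GO:0003674"),
  child  = c("GO:0004000", "GO:0005488"),
  stringsAsFactors = FALSE
)

# Write a tab-delimited file with no header, no row names, and no quoting
myont_path <- tempfile(fileext = ".txt")
write.table(edges, file = myont_path, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the file back to confirm it has exactly two columns
check <- read.delim(myont_path, header = FALSE, stringsAsFactors = FALSE)
```

The resulting path could then be passed as myont together with specify.ont = TRUE.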

specify.annotations

Boolean indicating whether the user wants to specify sequence annotations from a file. Should only be TRUE if specify.ont is TRUE.

annotfile

Character object indicating which file to read sequence annotations from. Should be a tab-delimited file with 2 columns. The first column is a list of sequences, the second is a list of GO accessions in the same rows as the sequences they annotate.
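A minimal sketch of building an annotation file in this layout with base R. The sequence identifiers and GO accessions are hypothetical, and two details are assumptions not stated above: that the file has no header, and that a sequence with several annotations is listed on several rows.

```r
# Hypothetical sequence-to-term annotations, one pair per row
annots <- data.frame(
  sequence = c("seqA", "seqA", "seqB"),
  go       = c("GO:0004000", "GO:0005488", "GO:0005488"),
  stringsAsFactors = FALSE
)

# Write a tab-delimited file (assumed: no header, no row names, no quoting)
annot_path <- tempfile(fileext = ".txt")
write.table(annots, file = annot_path, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the file back to confirm the two-column layout
annot_check <- read.delim(annot_path, header = FALSE, stringsAsFactors = FALSE)
```

The resulting path could then be passed as annotfile together with specify.annotations = TRUE.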

Value

Does not return a specific value. Saves the information accretion values for each term in the ontology in a .rda file that specifies the organism and the ont version. Parent count and term count objects are also saved in similarly formatted files so that IA calculations from multiple organisms can be combined.
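This page does not describe how the saved counts are combined. The following is only a sketch, under two assumptions: that information accretion for a term is estimated as the negative log2 of the fraction of sequences annotated with the term among those annotated with all of its parents, and that counts from different organisms are simply summed. The counts and the helper ia_from_counts are hypothetical and are not part of SemDist.

```r
# Hypothetical term and parent counts for the same terms in two organisms
term_count_a   <- c("GO:0004000" = 5,  "GO:0005488" = 40)
parent_count_a <- c("GO:0004000" = 50, "GO:0005488" = 100)
term_count_b   <- c("GO:0004000" = 3,  "GO:0005488" = 24)
parent_count_b <- c("GO:0004000" = 14, "GO:0005488" = 28)

# Assumed estimator: IA = -log2(term count / parent count)
ia_from_counts <- function(term_count, parent_count) {
  -log2(term_count / parent_count)
}

# Sum the counts across organisms first, then compute IA from the totals
combined_ia <- ia_from_counts(term_count_a + term_count_b,
                              parent_count_a + parent_count_b)
```

Summing counts before taking the logarithm weights each organism by the amount of annotation data it contributes, which averaging per-organism IA values would not do.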

Note

To compute IA for an organism, the user must install the annotation data package for that organism. The supported organisms (named in the format that should be passed to computeIA) and the corresponding packages are:

anopheles = org.Ag.eg.db

arabidopsis = org.At.tair.db

bovine = org.Bt.eg.db

canine = org.Cf.eg.db

chicken = org.Gg.eg.db

chimp = org.Pt.eg.db

ecolik12 = org.EcK12.eg.db

fly = org.Dm.eg.db

human = org.Hs.eg.db

malaria = org.Pf.plasmo.db

mouse = org.Mm.eg.db

pig = org.Ss.eg.db

rat = org.Rn.eg.db

rhesus = org.Mmu.eg.db

worm = org.Ce.eg.db

xenopus = org.Xl.eg.db

yeast = org.Sc.sgd.db

zebrafish = org.Dr.eg.db
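A short sketch of checking whether the required package is available before calling computeIA. The lookup vector org_pkgs is a hypothetical helper that copies a few entries from the list above; it is not part of SemDist.

```r
# Hypothetical lookup from organism name to annotation package (subset of the list above)
org_pkgs <- c(human = "org.Hs.eg.db",
              mouse = "org.Mm.eg.db",
              yeast = "org.Sc.sgd.db")

organism <- "human"
pkg <- org_pkgs[[organism]]

# Report how to install the package if it is missing; nothing is installed here
pkg_available <- requireNamespace(pkg, quietly = TRUE)
if (!pkg_available) {
  message("Install with: BiocManager::install(\"", pkg, "\")")
}
```

Once the package is installed, a call following the usage above, such as computeIA("MF", "human"), should be able to find the annotation data.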

Author(s)

Ian Gonzalez and Wyatt Clark

See Also

RUMIcurve, findRUMI

Examples

          
library(SemDist)

# Paths to the example ontology and annotation files shipped with SemDist
ontfile <- system.file("extdata", "mfo_ontology.txt", package="SemDist")
annotations <- system.file("extdata", "MFO_LABELS_TEST.txt", package="SemDist")

# Calculate IA from the user-specified ontology and annotations
computeIA("my", "values", specify.ont=TRUE, 
          myont=ontfile, specify.annotations=TRUE, 
          annotfile=annotations)

[Package SemDist version 1.28.0 Index]