1 Dataset: IFNg

This dataset contains RNA abundance data from a mouse study. RNA abundance was measured by RNA-Seq in two groups of mice: IFNG knockout mice and wild-type mice. This demo dataset is a subset of the original data and includes 4 knockout mice and 4 wild-type mice. More information can be found in Greer, Renee L., Xiaoxi Dong, et al. 2016.

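library(SBGNview.data)
library(SummarizedExperiment)
# Load both demo datasets
data("IFNg", "cancer.ds")
# First rows of the counts assay (genes in rows, samples in columns)
head(assays(IFNg)$counts)
# Group label of each sample
IFNg$group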

2 Dataset: cancer.ds (GSE16873)

This dataset consists of the first three samples of the gse16873.d dataset in the pathview package (i.e. columns “DCIS_1”, “DCIS_2” and “DCIS_3”). The original values are used without additional processing. It is intended only to demonstrate SBGNview’s visualization features, not for data analysis.

head(assays(cancer.ds)$counts)

3 ID mapping files

The SBGNview.data package contains multiple pairwise ID mapping tables.

3.1 Molecule (or gene) IDs to glyph IDs in SBGN-ML files

The SBGN-ML files collected by SBGNview are generated from BioPAX files, and the glyph IDs in the SBGN-ML files correspond to the IDs of the “Protein” XML elements in the BioPAX files. An example BioPAX file: http://www.pathwaycommons.org/archives/PC3/v10/PathwayCommons10.reactome.BIOPAX.owl.gz A “Protein” XML element in a BioPAX file contains “UnificationXref” child elements that record the molecule IDs of that “Protein”. The mappings between molecule IDs and glyph IDs are extracted accordingly: the glyph ID (e.g. pathwayCommons) is taken from the ID of each “Protein” element, and its matching molecule ID (e.g. ENTREZID) is taken from the corresponding “UnificationXref” child element. Normally, the molecule IDs come from the single species for which the pathway is defined (e.g. human: KEGG code ‘hsa’).

# hsa ID <=> glyph ID
data(hsa_ENTREZID_pathwayCommons)
head(hsa_ENTREZID_pathwayCommons)

In this example, the mapping table contains only human (hsa) molecule IDs: hsa ID <=> glyph ID. Molecule IDs of other species (e.g. mouse: KEGG code ‘mmu’) are mapped to glyph IDs through the molecule IDs of the reference species (e.g. hsa), that is, by combining two tables: 1. hsa ID <=> glyph ID (the example above); 2. mmu ID <=> hsa ID (see below).

3.2 Molecule (or gene) IDs to KEGG ortholog IDs

The table mmu ID <=> hsa ID is generated from KEGG ortholog (KO) group definitions (e.g. http://rest.kegg.jp/link/genes/K00500). The two tables below link mmu IDs and hsa IDs to KO IDs.

# mmu ID <=> KO
data(mmu_KO_ENTREZID)
head(mmu_KO_ENTREZID)
# hsa ID <=> KO
data(hsa_KO_ENTREZID)
head(hsa_KO_ENTREZID)
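
The two-step combination described in section 3.1 (mmu ID to hsa ID through a shared KO group, then hsa ID to glyph ID) can be sketched with two base R merges. This is a minimal illustration, not the package's actual procedure: the toy tables, their IDs, and the column names (`entrez_mmu`, `entrez_hsa`, `KO`, `glyph_id`) are hypothetical stand-ins, so check the `head()` output above for the real column layout before adapting it.

```r
# Toy stand-ins for the package tables. All IDs and column names here are
# hypothetical and chosen only for illustration.
mmu_ko <- data.frame(entrez_mmu = c("m1", "m2", "m3"),
                     KO = c("KO_a", "KO_b", "KO_c"),
                     stringsAsFactors = FALSE)
hsa_ko <- data.frame(entrez_hsa = c("h1", "h2"),
                     KO = c("KO_a", "KO_b"),
                     stringsAsFactors = FALSE)
hsa_glyph <- data.frame(entrez_hsa = c("h1", "h2"),
                        glyph_id = c("Protein_a", "Protein_b"),
                        stringsAsFactors = FALSE)

# Step 1: mmu ID <=> hsa ID, joined on the shared KO group.
# merge() performs an inner join by default, so "m3" (no human gene in
# its KO group) is dropped.
mmu_hsa <- merge(mmu_ko, hsa_ko, by = "KO")

# Step 2: mmu ID <=> glyph ID, joined on the hsa ID.
mmu_glyph <- merge(mmu_hsa, hsa_glyph, by = "entrez_hsa")
mmu_glyph <- unique(mmu_glyph[, c("entrez_mmu", "glyph_id")])

print(mmu_glyph)
```

Because both joins are inner joins, a mouse gene without a human ortholog in the same KO group, or whose ortholog has no glyph, simply has no row in the result.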

3.3 Molecule (or gene) IDs to pathway IDs

These tables are generated by merging two tables: 1. Molecule ID <=> glyph ID (e.g. hsa_ENTREZID_pathwayCommons, see above); 2. glyph ID <=> pathway ID (see below). The table glyph ID <=> pathway ID is generated by parsing an SBGN-ML file, for which the pathway ID and its component glyph IDs are known.

# Compound ID: chebi
data(chebi_pathwayCommons)
head(chebi_pathwayCommons)
data(chebi_pathway.id)
head(chebi_pathway.id)
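
The merge described in this section can be sketched as follows. This is an illustrative example under stated assumptions rather than the code used to build the package tables: the toy data, the IDs, and the column names (`chebi`, `glyph_id`, `pathway_id`) are hypothetical.

```r
# Hypothetical toy tables; column names and IDs are illustrative only.
# Molecule ID <=> glyph ID: compound "c1" is drawn as two glyphs.
mol_glyph <- data.frame(chebi = c("c1", "c1", "c2"),
                        glyph_id = c("g1", "g2", "g3"),
                        stringsAsFactors = FALSE)
# Glyph ID <=> pathway ID, as it would be obtained by parsing SBGN-ML files.
glyph_pathway <- data.frame(glyph_id = c("g1", "g2", "g3"),
                            pathway_id = c("P1", "P1", "P2"),
                            stringsAsFactors = FALSE)

# Join on the glyph ID, then keep one row per molecule/pathway pair.
mol_pathway <- merge(mol_glyph, glyph_pathway, by = "glyph_id")
mol_pathway <- unique(mol_pathway[, c("chebi", "pathway_id")])

print(mol_pathway)
```

The `unique()` call matters because one molecule can appear as several glyphs in the same pathway; without it, such a molecule would be listed more than once for that pathway.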

Supporting Datasets for SBGNview Package

Xiaoxi Dong

Kovidh Vegesna, kvegesna (AT) uncc.edu