1 Introduction

The TCGA tumor types span a range of anatomical compartments, and organizing them into groups of related compartments may be fruitful. We will use the oncotree OBO representation from an NCI Thesaurus OBO distribution, as shipped in the Bioc 3.9 version of ontoProc.

2 A table

This table was constructed by hand on Oct 10 2019 using materials in the ontoProc package.

3 Formal annotation of anatomic site

3.1 Expeditious mapping

We will drop the CNTL class, and use only the first NCIT mapping when two seem to match.

##              code                  oncotr_site         ncit
## NCIT:C39851  BLCA Bladder Urothelial Carcinoma  NCIT:C39851
## NCIT:C132067  LGG             Low Grade Glioma NCIT:C132067
## NCIT:C3234   MESO                 Mesothelioma   NCIT:C3234
## NCIT:C3171    AML       Acute Myeloid Leukemia   NCIT:C3171
## NCIT:C4004   STAD       Gastric Adenocarcinoma   NCIT:C4004
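The two rules just stated (drop CNTL, keep the first NCIT match) can be sketched in base R as follows. This is a minimal illustration, not the code that produced the output above: the data frame `cand`, its long format, and the identifier `NCIT:HYPOTHETICAL` are assumptions made for the example, and only the BLCA, LGG and MESO identifiers are taken from the printed rows.

```r
# Candidate matches in long format: one row per (code, NCIT term) pair.
# The identifiers for BLCA, LGG and MESO come from the printed output;
# "NCIT:HYPOTHETICAL" and the CNTL row are placeholders for illustration.
cand <- data.frame(
  code = c("BLCA", "LGG", "LGG", "MESO", "CNTL"),
  ncit = c("NCIT:C39851", "NCIT:C132067", "NCIT:HYPOTHETICAL",
           "NCIT:C3234", NA),
  stringsAsFactors = FALSE
)

# Rule 1: drop the control class.
cand <- cand[cand$code != "CNTL", ]

# Rule 2: when a code has several candidates, keep only the first one.
one2one <- cand[!duplicated(cand$code), ]

# Each code should now appear exactly once.
stopifnot(!anyDuplicated(one2one$code))
```

Taking the first match is expedient rather than principled; if the order of candidates is arbitrary, the choice of term is arbitrary too, so it may be worth reviewing the codes that had more than one candidate.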

We now have a 1-1 mapping from TCGA code to NCIT site. These sites can be grouped according to organ system, using the knowledge that NCIT:C3263 is the ‘neoplasm by site’ (which really should be ‘system’) category.

Neither thymoma nor mesothelioma has an NCIT organ system mapping per se.
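One way to picture the grouping step is as a walk up the ontology from each tumor term until a direct child of NCIT:C3263 is reached. The sketch below uses a toy child-to-parent map in base R; every identifier other than NCIT:C3263 is invented for illustration, and the real NCIT hierarchy is deeper and allows several parents per term, so this is not a substitute for querying the ontology itself.

```r
# Toy child -> parent map. All "TOY:" identifiers are hypothetical.
parent <- c(
  "TOY:sysA"   = "NCIT:C3263",
  "TOY:sysB"   = "NCIT:C3263",
  "TOY:tumor1" = "TOY:sysA",
  "TOY:mid"    = "TOY:sysB",
  "TOY:tumor2" = "TOY:mid",
  "TOY:tumor3" = "TOY:other"   # no path to NCIT:C3263
)

# Walk up from `term` and return the ancestor that sits directly under
# `root`, or NA when the walk ends without reaching `root`.
# Assumes a single parent per term and no cycles.
system_of <- function(term, parent, root = "NCIT:C3263") {
  cur <- term
  while (cur %in% names(parent)) {
    nxt <- parent[[cur]]
    if (nxt == root) return(cur)
    cur <- nxt
  }
  NA_character_
}

tumors <- c("TOY:tumor1", "TOY:tumor2", "TOY:tumor3")
sys <- vapply(tumors, system_of, character(1), parent = parent)
```

Terms with no path to the root come back as `NA`, which is the situation described above for thymoma and mesothelioma; such cases need a manual decision.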

3.2 Aggregation

We now have 12 categories for 33 tumor types. A code pattern for finding the TCGA codes for a given system is:

##   code                   oncotr_site        ncit
## 1 CESC    Cervical Squamous Neoplasm NCIT:C40195
## 2   OV Ovarian Serous Adenocarcinoma  NCIT:C7550
## 3 PRAD       Prostate Adenocarcinoma  NCIT:C2919
## 4 TGCT    Testicular Germ Cell Tumor  NCIT:C8591
## 5  UCS        Uterine Carcinosarcoma NCIT:C42700
##                            sys
## 1 Reproductive System Neoplasm
## 2 Reproductive System Neoplasm
## 3 Reproductive System Neoplasm
## 4 Reproductive System Neoplasm
## 5 Reproductive System Neoplasm
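A lookup of this kind can be sketched in base R as a filter on the `sys` column. The object name `tcga_sys` and the helper `codes_for_system` are hypothetical; the five reproductive-system rows are copied from the output above, and BLCA is included with `sys = NA` only because its system is not shown in this text.

```r
# Annotated mapping: code, NCIT term, and organ system.
# `tcga_sys` is a placeholder name for whatever object holds this table.
tcga_sys <- data.frame(
  code = c("CESC", "OV", "PRAD", "TGCT", "UCS", "BLCA"),
  ncit = c("NCIT:C40195", "NCIT:C7550", "NCIT:C2919",
           "NCIT:C8591", "NCIT:C42700", "NCIT:C39851"),
  # BLCA's system is not shown in the text, so it is left as NA here.
  sys  = c(rep("Reproductive System Neoplasm", 5), NA),
  stringsAsFactors = FALSE
)

# Return the codes annotated to one system; rows with a missing
# system never match.
codes_for_system <- function(df, system) {
  df$code[!is.na(df$sys) & df$sys == system]
}

repro <- codes_for_system(tcga_sys, "Reproductive System Neoplasm")

# Codes grouped by system, and the number of tumor types per system.
by_system <- split(tcga_sys$code, tcga_sys$sys)
n_per_system <- table(tcga_sys$sys)
```

Both `split` and `table` silently drop rows whose system is `NA`, so tumor types without a system mapping should be counted separately.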