1 Basics

CytoML provides the flowjo_to_gatingset function, which parses a FlowJo workspace (xml or wsp file) together with its FCS files into a self-contained GatingSet object. The GatingSet captures the entire analysis recorded in FlowJo, including compensation, transformation and gating.

1.1 Open the workspace

library(CytoML)
dataDir <- system.file("extdata",package="flowWorkspaceData")
wsfile <- list.files(dataDir, pattern="manual.xml",full=TRUE)
ws <- open_flowjo_xml(wsfile)
ws
## File location:  /home/biocbuild/bbs-3.20-bioc/R/site-library/flowWorkspaceData/extdata/manual.xml 
## 
## Groups in Workspace
##          Name Num.Samples
## 1 All Samples          45
## 2      B-cell           4
## 3          DC           4
## 4      T-cell           4
## 5     Thelper           4
## 6        Treg           4

1.1.1 Extract groups from xml

Once the workspace is opened, sample group information can be retrieved:

tail(fj_ws_get_sample_groups(ws))
##    groupName groupID sampleID
## 60   Thelper       4       30
## 61   Thelper       4       31
## 62      Treg       5       37
## 63      Treg       5       38
## 64      Treg       5       39
## 65      Treg       5       40

1.1.2 Extract Samples from xml

Sample information can likewise be retrieved for a given group:

fj_ws_get_samples(ws, group_id = 5)
##   sampleID                    name  count pop.counts
## 1       28 CytoTrol_CytoTrol_1.fcs 136304         23
## 2       29 CytoTrol_CytoTrol_2.fcs 115827         23
## 3       30 CytoTrol_CytoTrol_3.fcs 123170         23
## 4       31 CytoTrol_CytoTrol_4.fcs 114802         23

1.1.3 Extract keywords from xml

The keywords recorded in the xml for a given sample can be retrieved by sample ID:

fj_ws_get_keywords(ws, 28)[1:5]
## $`$BEGINANALYSIS`
## [1] "0"
## 
## $`$BEGINDATA`
## [1] "3241"
## 
## $`$BEGINSTEXT`
## [1] "0"
## 
## $`$BTIM`
## [1] "13:28:46"
## 
## $`$BYTEORD`
## [1] "4,3,2,1"
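Since the result prints as a named list, a single keyword can be pulled out by name. The following is a minimal sketch rather than a recommended workflow: it reopens the same workspace, reads `$BTIM` for sample 28, and then collects the same keyword for every sample returned by fj_ws_get_samples. The variable names `kw` and `btim` are illustrative, and the loop assumes that every sample in the group records a `$BTIM` keyword.

```r
library(CytoML)

# Reopen the example workspace shipped with flowWorkspaceData
dataDir <- system.file("extdata", package = "flowWorkspaceData")
ws <- open_flowjo_xml(file.path(dataDir, "manual.xml"))

# Keywords for one sample, returned as a named list
kw <- fj_ws_get_keywords(ws, 28)

# Look up one keyword by name (shown as "13:28:46" in the output above)
kw[["$BTIM"]]

# Collect the same keyword for each sample of a group.
# Assumption: each sampleID reported by fj_ws_get_samples() is accepted
# by fj_ws_get_keywords(), as it is for sample 28.
samples <- fj_ws_get_samples(ws, group_id = 5)
btim <- lapply(samples$sampleID, function(id) fj_ws_get_keywords(ws, id)[["$BTIM"]])
names(btim) <- samples$name
btim
```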

1.2 Parse with default settings

In the majority of use cases, only two parameters are required to complete the parsing, i.e.

  • name: the group to import
  • path: the data path of FCS files.

1.2.1 Select group

The name parameter can be set to one of the group names displayed above through the flowJo_workspace APIs.

gs <- flowjo_to_gatingset(ws, name = "T-cell")

name can also be the numeric index of the group:

gs <- flowjo_to_gatingset(ws, name = 4)

1.2.2 FCS path

As shown above, path can be omitted if the FCS files are located in the same folder as the xml file.

1.2.2.1 string

Otherwise, path is set to the actual folder where the FCS files are located. The folder can contain sub-folders, and the parser will recursively search the directory for FCS files (by matching the file names, keywords, etc.).

gs <- flowjo_to_gatingset(ws, name = 4, path = dataDir)

1.2.2.2 data.frame

path can alternatively be a data.frame, which should contain two columns: ‘sampleID’ and ‘file’. It essentially provides a hardcoded mapping between ‘sampleID’ and the (absolute) FCS file path, which avoids the file system search and the sample matching process (between the FlowJo sample reference and the FCS files).

However, this is rarely needed, since the automatic search is generally accurate and robust.
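As an illustration of the expected shape, the sketch below builds such a mapping in base R. The helper `build_fcs_map` and the folder `/data/fcs` are hypothetical, and the small `samples` table is a stand-in copied from the fj_ws_get_samples output shown earlier, so the block runs without a workspace. The final call is left commented out because it depends on the FCS files actually existing at the mapped paths.

```r
# Hypothetical helper: turn a sample table with columns 'sampleID' and 'name'
# (the shape printed by fj_ws_get_samples() above) into the two-column
# mapping described in this section.
build_fcs_map <- function(samples, fcs_dir) {
  data.frame(
    sampleID = samples$sampleID,
    file = file.path(fcs_dir, samples$name),
    stringsAsFactors = FALSE
  )
}

# Stand-in for fj_ws_get_samples(ws, group_id = 5), first two rows only
samples <- data.frame(
  sampleID = c(28, 29),
  name = c("CytoTrol_CytoTrol_1.fcs", "CytoTrol_CytoTrol_2.fcs"),
  stringsAsFactors = FALSE
)

# "/data/fcs" is a placeholder; use the absolute folder holding your FCS files
fcs_map <- build_fcs_map(samples, fcs_dir = "/data/fcs")
fcs_map

# With a workspace `ws` opened as above, the mapping replaces the folder path.
# Not run here; it assumes these samples belong to the "Thelper" group.
# gs <- flowjo_to_gatingset(ws, name = "Thelper", path = fcs_map)
```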

2 Advanced

Because of the variety of issues that can occur in FlowJo workspaces or FCS files, the default settings are sometimes not sufficient to handle certain edge cases. For example, an error may occur at a specific gate because of incorrect gate parameters defined in the xml, while we still want to import the upstream gates that remain useful. Or there may be letter case inconsistency in the channels used in the xml, which triggers an error by default.

The parser also provides other features that allow users to speed up the parsing or to extract more metadata from either the xml or the FCS files.

flowjo_to_gatingset provides more parameters that can be configured to solve different problems during parsing. This document aims to go through these parameters and explore them one by one.

2.1 Parsing xml without loading FCS

It is possible to import only the gating structure, without reading the FCS data, by setting the execute flag to FALSE.

gs <- flowjo_to_gatingset(ws, name = 4, execute = FALSE)
gs
## A GatingSet with 4 samples

The gating hierarchy is immediately available:

suppressMessages(library(flowWorkspace))
plot(gs)
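Besides plotting, the imported structure can be inspected as text. This is a minimal sketch that assumes the flowWorkspace accessors sampleNames and gs_get_pop_paths are available in the installed version; it repeats the setup so that it runs on its own.

```r
library(CytoML)
suppressMessages(library(flowWorkspace))

# Same workspace and group as above, imported without reading FCS data
dataDir <- system.file("extdata", package = "flowWorkspaceData")
ws <- open_flowjo_xml(file.path(dataDir, "manual.xml"))
gs <- flowjo_to_gatingset(ws, name = 4, execute = FALSE)

# Samples carried by the GatingSet
sampleNames(gs)

# Population (gate) paths of the hierarchy, using the shortest unique names
gs_get_pop_paths(gs, path = "auto")
```

Since no FCS data are read with execute = FALSE, only the structure is expected to be meaningful at this point, not event-level results.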