MSstatsTMT : A package for protein significance analysis in shotgun mass spectrometry-based proteomic experiments with tandem mass tag (TMT) labeling

Ting Huang (thuang0703@gmail.com), Meena Choi (mnchoi67@gmail.com), Sicheng Hao (hao.sic@husky.neu.edu)

2019-02-27

library(MSstatsTMT)

This vignette introduces MSstatsTMT and describes the options of its main functions.

MSstatsTMT includes the following three steps for statistical testing:

  1. Converters that turn the output of different peptide quantification tools into the required input format: PDtoMSstatsTMTFormat, MaxQtoMSstatsTMTFormat and SpectroMinetoMSstatsTMTFormat.
  2. Protein summarization based on peptide quantification data: proteinSummarization
  3. Group comparison on protein quantification data: groupComparisonTMT

1. Converters for different peptide quantification tools

PDtoMSstatsTMTFormat()

Preprocess PSM data from Proteome Discoverer and convert into the required input format for MSstatsTMT.

Arguments

  • input : name of the data read from the Proteome Discoverer PSM output (the PSM sheet).
  • annotation : data frame which contains column Run, Channel, Condition, BioReplicate, Mixture.
  • fraction : indicates whether the data has fractions. If there are fractions, then overlapped peptide ions will be removed and then fractions are combined for each biological mixture.
  • which.proteinid : column to use for the protein name. Protein.Accessions (default); Master.Protein.Accessions can be used instead.
  • useNumProteinsColumn : TRUE (default) removes shared peptides using the information in the # Proteins column of the PSM sheet.
  • useUniquePeptide : TRUE (default) removes peptides that are assigned to more than one protein. We assume that only unique peptides are used for each protein.
  • rmPSM_withMissing_withinRun : TRUE will remove PSM with any missing value within each Run. Default is FALSE.
  • rmPSM_withfewMea_withinRun : used only when rmPSM_withMissing_withinRun = FALSE. TRUE (default) removes features that have only 1 or 2 measurements within each Run.
  • removeProtein_with1Peptide : TRUE removes proteins that have only one peptide and charge state. Default is FALSE.
  • summaryforMultipleRows : sum (default) or max. When a PSM has multiple measurements in a run, keep the measurement with the largest summed intensity (sum) or the largest maximal intensity (max).
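
The annotation argument expects one row per Run and Channel combination. The following is a minimal base-R sketch of how such a data frame could be assembled; the run names, channel subset and condition labels are hypothetical and only illustrate the five required columns. It is not produced by MSstatsTMT or by Proteome Discoverer.

```r
# Hypothetical design: 2 runs (one per mixture), 4 TMT channels each.
runs     <- c("Mixture1_01.raw", "Mixture2_01.raw")   # made-up file names
channels <- c("126", "127N", "127C", "128N")

# One row per Run x Channel combination.
annotation <- expand.grid(Channel = channels, Run = runs,
                          stringsAsFactors = FALSE)

# Mixture: the biological mixture that each run belongs to.
annotation$Mixture <- ifelse(annotation$Run == "Mixture1_01.raw",
                             "Mixture1", "Mixture2")

# Condition: "Norm" marks the normalization channel in each run.
condition_by_channel <- c("126" = "Norm", "127N" = "0.667",
                          "127C" = "0.125", "128N" = "0.5")
annotation$Condition <- unname(condition_by_channel[annotation$Channel])

# BioReplicate: a unique label per biological sample (here mixture + channel).
annotation$BioReplicate <- paste(annotation$Mixture, annotation$Channel,
                                 sep = "_")

# Put the columns in the order listed in the argument description.
required   <- c("Run", "Channel", "Condition", "BioReplicate", "Mixture")
annotation <- annotation[, required]

# Basic sanity check before passing the table to a converter.
stopifnot(all(required %in% colnames(annotation)))
head(annotation)
```

Checking the column names up front is a cheap way to catch a misnamed column before the converter is called.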

Example

# read in PD PSM sheet
# raw.pd <- read.delim("161117_SILAC_HeLa_UPS1_TMT10_5Mixtures_3TechRep_UPSdB_Multiconsensus_PD22_Intensity_PSMs.txt")
head(raw.pd)
#>    Checked Confidence Identifying.Node PSM.Ambiguity
#> 1:   FALSE       High      Mascot (O4)   Unambiguous
#> 2:   FALSE       High      Mascot (K2)   Unambiguous
#> 3:   FALSE       High      Mascot (K2)   Unambiguous
#> 4:   FALSE       High      Mascot (F2)      Selected
#> 5:   FALSE       High      Mascot (K2)   Unambiguous
#> 6:   FALSE       High      Mascot (K2)   Unambiguous
#>                        Annotated.Sequence
#> 1: [K].gFQQILAGEYDHLPEQAFYMVGPIEEAVAk.[A]
#> 2:          [R].qYPWGVAEVENGEHcDFTILr.[N]
#> 3:              [R].dkPSVEPVEEYDYEDLk.[E]
#> 4:                      [R].hEHQVMLmr.[Q]
#> 5:       [R].dNLTLWTADNAGEEGGEAPQEPQS.[-]
#> 6:         [R].aLVAIGTHDLDTLSGPFTYTAk.[R]
#>                                                      Modifications
#> 1:                                 N-Term(TMT6plex); K30(TMT6plex)
#> 2: N-Term(TMT6plex); C15(Carbamidomethyl); R21(Label:13C(6)15N(4))
#> 3:                         N-Term(TMT6plex); K2(Label); K17(Label)
#> 4:         N-Term(TMT6plex); M8(Oxidation); R9(Label:13C(6)15N(4))
#> 5:                                                N-Term(TMT6plex)
#> 6:                                    N-Term(TMT6plex); K22(Label)
#>    Marked.as X..Protein.Groups X..Proteins Master.Protein.Accessions
#> 1:        NA                 1           1                    P06576
#> 2:        NA                 1           1                    Q16181
#> 3:        NA                 1           1                    Q9Y450
#> 4:        NA                 1           1                    Q15233
#> 5:        NA                 1           1                    P31947
#> 6:        NA                 1           1                    Q9NSD9
#>                                                            Master.Protein.Descriptions
#> 1:         ATP synthase subunit beta, mitochondrial OS=Homo sapiens GN=ATP5B PE=1 SV=3
#> 2:                                         Septin-7 OS=Homo sapiens GN=SEPT7 PE=1 SV=2
#> 3:                                HBS1-like protein OS=Homo sapiens GN=HBS1L PE=1 SV=1
#> 4: Non-POU domain-containing octamer-binding protein OS=Homo sapiens GN=NONO PE=1 SV=4
#> 5:                               14-3-3 protein sigma OS=Homo sapiens GN=SFN PE=1 SV=1
#> 6:          Phenylalanine--tRNA ligase beta subunit OS=Homo sapiens GN=FARSB PE=1 SV=3
#>    Protein.Accessions
#> 1:             P06576
#> 2:             Q16181
#> 3:             Q9Y450
#> 4:             Q15233
#> 5:             P31947
#> 6:             Q9NSD9
#>                                                                   Protein.Descriptions
#> 1:         ATP synthase subunit beta, mitochondrial OS=Homo sapiens GN=ATP5B PE=1 SV=3
#> 2:                                         Septin-7 OS=Homo sapiens GN=SEPT7 PE=1 SV=2
#> 3:                                HBS1-like protein OS=Homo sapiens GN=HBS1L PE=1 SV=1
#> 4: Non-POU domain-containing octamer-binding protein OS=Homo sapiens GN=NONO PE=1 SV=4
#> 5:                               14-3-3 protein sigma OS=Homo sapiens GN=SFN PE=1 SV=1
#> 6:          Phenylalanine--tRNA ligase beta subunit OS=Homo sapiens GN=FARSB PE=1 SV=3
#>    X..Missed.Cleavages Charge DeltaScore DeltaCn Rank Search.Engine.Rank
#> 1:                   0      3     1.0000       0    1                  1
#> 2:                   0      3     1.0000       0    1                  1
#> 3:                   1      3     0.9730       0    1                  1
#> 4:                   0      4     0.5250       0    1                  1
#> 5:                   0      3     1.0000       0    1                  1
#> 6:                   0      3     0.9783       0    1                  1
#>     m.z..Da. MH...Da. Theo..MH...Da. DeltaM..ppm. Deltam.z..Da.
#> 1: 1270.3249 3808.960       3808.966        -1.51      -0.00192
#> 2:  920.4493 2759.333       2759.332         0.31       0.00028
#> 3:  920.1605 2758.467       2758.461         2.08       0.00192
#> 4:  359.6898 1435.737       1435.738        -0.04      -0.00002
#> 5:  920.0943 2758.268       2758.264         1.53       0.00141
#> 6:  919.8502 2757.536       2757.532         1.48       0.00136
#>    Activation.Type MS.Order Isolation.Interference....
#> 1:             CID      MS2                  47.955590
#> 2:             CID      MS2                   9.377507
#> 3:             CID      MS2                  38.317050
#> 4:             CID      MS2                  21.390040
#> 5:             CID      MS2                   0.000000
#> 6:             CID      MS2                  30.619960
#>    Average.Reporter.S.N Ion.Inject.Time..ms. RT..min. First.Scan
#> 1:                  8.7               50.000 212.2487     112815
#> 2:                  8.1                3.242 164.7507      87392
#> 3:                 17.8               13.596 143.4534      74786
#> 4:                 36.5               50.000  21.6426       6458
#> 5:                 16.7                6.723 174.1863      92950
#> 6:                 26.7                8.958 176.4863      94294
#>                                   Spectrum.File File.ID Abundance..126
#> 1: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_03.raw      F1       2548.326
#> 2: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      22861.765
#> 3: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      25504.083
#> 4: 161117_SILAC_HeLa_UPS1_TMT10_Mixture4_02.raw     F10      13493.228
#> 5: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      64582.786
#> 6: 161117_SILAC_HeLa_UPS1_TMT10_Mixture3_03.raw      F5      35404.709
#>    Abundance..127N Abundance..127C Abundance..128N Abundance..128C
#> 1:        3231.929        2760.839        4111.639        3127.254
#> 2:       25817.946       23349.498       29449.609       25995.929
#> 3:       27740.450       25144.974       25754.579       29923.176
#> 4:       14674.490       11187.900       12831.495       13839.426
#> 5:       50576.417       47126.037       56285.129       46257.310
#> 6:       31905.852       30993.941       36854.351       37506.001
#>    Abundance..129N Abundance..129C Abundance..130N Abundance..130C
#> 1:        1874.163        2831.423        2298.401        3798.876
#> 2:       22955.769       30578.971       30660.488       38728.853
#> 3:       34097.637       31650.255       27632.692       23886.881
#> 4:       12441.353       13450.885       14777.844       13039.995
#> 5:       52634.885       49716.850       60660.574       55830.488
#> 6:       25703.444       38626.598       35447.942       33788.409
#>    Abundance..131 Quan.Info Ions.Score Identity.Strict Identity.Relaxed
#> 1:       3739.067        NA         90              28               21
#> 2:      25047.280        NA         76              24               17
#> 3:      35331.092        NA         74              30               23
#> 4:      12057.121        NA         40              25               18
#> 5:      40280.577        NA         38              21               14
#> 6:      32031.516        NA         46              29               22
#>    Expectation.Value Percolator.q.Value Percolator.PEP
#> 1:      7.038672e-09                  0      1.396e-05
#> 2:      6.298627e-08                  0      3.349e-07
#> 3:      4.318385e-07                  0      9.922e-07
#> 4:      3.351211e-04                  0      1.175e-04
#> 5:      2.152501e-04                  0      1.383e-05
#> 6:      2.060469e-04                  0      7.198e-05

# Read in annotation including condition and biological replicates per run and channel.
# Users should make this annotation file. It is not the output from Proteome Discoverer.
# annotation.pd <- read.csv(file="PD_Annotation.csv", header=TRUE)
head(annotation.pd)
#>                                            Run Channel Condition  Mixture
#> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw     126      Norm Mixture1
#> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw    127N     0.667 Mixture1
#> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw    127C     0.125 Mixture1
#> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw    128N       0.5 Mixture1
#> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw    128C         1 Mixture1
#> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw    129N     0.125 Mixture1
#>   BioReplicate
#> 1       1.X126
#> 2     1.X127_N
#> 3     1.X127_C
#> 4     1.X128_N
#> 5     1.X128_C
#> 6     1.X129_N

# do not remove PSM with missing values within one run
input.pd <- PDtoMSstatsTMTFormat(raw.pd, annotation.pd)
#> ** Shared PSMs (assigned in multiple proteins) are removed.
#> ** 55 features have 1 or 2 intensities across runs and are removed.
#> ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows.
head(input.pd)
#>   ProteinName               PeptideSequence Charge
#> 1      P04406        [K].lISWYDNEFGYSNR.[V]      2
#> 2      Q9NSD9           [K].irPFAVAAVLr.[N]      3
#> 3      P04406    [K].lVINGNPITIFQErDPSk.[I]      3
#> 4      P04406          [R].vVDLmAHMASkE.[-]      3
#> 5      P06576      [R].dQEGQDVLLFIDNIFR.[F]      3
#> 6      P06576 [R].iPSAVGYQPTLATDMGTMQEr.[I]      3
#>                               PSM Channel Condition BioReplicate
#> 1        [K].lISWYDNEFGYSNR.[V]_2     126      Norm       1.X126
#> 2           [K].irPFAVAAVLr.[N]_3     126      Norm       1.X126
#> 3    [K].lVINGNPITIFQErDPSk.[I]_3     126      Norm       1.X126
#> 4          [R].vVDLmAHMASkE.[-]_3     126      Norm       1.X126
#> 5      [R].dQEGQDVLLFIDNIFR.[F]_3     126      Norm       1.X126
#> 6 [R].iPSAVGYQPTLATDMGTMQEr.[I]_3     126      Norm       1.X126
#>                                            Run  Mixture   Intensity
#> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1    8348.351
#> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1   28327.492
#> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1 1275010.965
#> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1   80589.877
#> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1    2231.389
#> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1  144854.307

# remove PSM with missing values within one run
input.pd.no.miss <- PDtoMSstatsTMTFormat(raw.pd, annotation.pd,
                                 rmPSM_withMissing_withinRun = TRUE)
#> ** Shared PSMs (assigned in multiple proteins) are removed.
#> ** Rows which has any missing value within a run were removed from that run.
#> ** 0 features have 1 or 2 intensities across runs and are removed.
#> ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows.
head(input.pd.no.miss)
#>   ProteinName          PeptideSequence Charge                        PSM
#> 1      P12277 [K].lAVEALSSLDGDLAGr.[Y]      3 [K].lAVEALSSLDGDLAGr.[Y]_3
#> 2      P04406   [K].lVINGNPITIFQEr.[D]      3   [K].lVINGNPITIFQEr.[D]_3
#> 3      Q16181     [K].dVTNNVHYENYr.[S]      3     [K].dVTNNVHYENYr.[S]_3
#> 4      P04406         [K].qASEGPLk.[G]      2         [K].qASEGPLk.[G]_2
#> 5      Q15233         [R].rQQEEMMr.[R]      3         [R].rQQEEMMr.[R]_3
#> 6      P06576 [R].dQEGQDVLLFIDNIFr.[F]      3 [R].dQEGQDVLLFIDNIFr.[F]_3
#>   Channel Condition BioReplicate
#> 1     126      Norm       1.X126
#> 2     126      Norm       1.X126
#> 3     126      Norm       1.X126
#> 4     126      Norm       1.X126
#> 5     126      Norm       1.X126
#> 6     126      Norm       1.X126
#>                                            Run  Mixture  Intensity
#> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1  23037.057
#> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1 349661.432
#> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1  40699.454
#> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1  13882.684
#> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1   9302.419
#> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw Mixture1  12261.325
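
The two calls above differ only in how PSMs with missing channel intensities are handled. The following toy sketch, in base R with made-up values and column names, illustrates the idea behind rmPSM_withMissing_withinRun and rmPSM_withfewMea_withinRun for a single run. It is an illustration of the filtering rules as described in the argument list, not the package's implementation.

```r
# Toy PSM-by-channel intensities for a single run (values are made up).
psm <- data.frame(
  PSM   = c("pepA_2", "pepB_3", "pepC_2", "pepD_2"),
  ch126 = c(100, 200,  NA,  NA),
  ch127 = c(110,  NA,  NA,  50),
  ch128 = c(120, 210,  30,  60),
  ch129 = c(130, 220,  NA,  70)
)
channels <- c("ch126", "ch127", "ch128", "ch129")

# Number of observed (non-missing) intensities per PSM in this run.
n_measured <- rowSums(!is.na(psm[, channels]))

# Idea of rmPSM_withMissing_withinRun = TRUE:
# keep only PSMs measured in every channel of the run.
complete_only <- psm[n_measured == length(channels), ]

# Idea of rmPSM_withfewMea_withinRun = TRUE:
# drop PSMs that have only 1 or 2 measurements in the run.
enough_measurements <- psm[n_measured > 2, ]

complete_only$PSM
enough_measurements$PSM
```

The first filter is stricter: it discards any PSM with a gap, whereas the second keeps partially observed PSMs as long as they have at least three measurements.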

MaxQtoMSstatsTMTFormat()

Preprocess PSM-level data from MaxQuant and convert into the required input format for MSstatsTMT.

Arguments

  • evidence : name of evidence.txt data, which includes PSM-level data.
  • proteinGroups : name of proteinGroups.txt data, which contains the detailed information of protein identifications.
  • annotation : data frame which contains column Run, Channel, Condition, BioReplicate, Mixture.
  • fraction : indicates whether the data has fractions. If there are fractions, then overlapped peptide ions will be removed and then fractions are combined for each biological mixture.
  • which.proteinid : column to use for the protein name. Proteins (default); Leading.proteins or Leading.razor.proteins can be used instead. However, these columns may contain shared peptides.
  • rmProt_Only.identified.by.site : TRUE removes proteins marked with ‘+’ in the ‘Only.identified.by.site’ column of proteinGroups.txt, that is, proteins identified only by a modification site. Default is FALSE.
  • useUniquePeptide : TRUE (default) removes peptides that are assigned to more than one protein. We assume that only unique peptides are used for each protein.
  • rmPSM_withMissing_withinRun : TRUE will remove PSM with any missing value within each Run. Default is FALSE.
  • rmPSM_withfewMea_withinRun : used only when rmPSM_withMissing_withinRun = FALSE. TRUE (default) removes features that have only 1 or 2 measurements within each Run.
  • removeProtein_with1Peptide : TRUE removes proteins that have only one peptide and charge state. Default is FALSE.
  • summaryforMultipleRows : sum (default) or max. When a PSM has multiple measurements in a run, keep the measurement with the largest summed intensity (sum) or the largest maximal intensity (max).
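
The effect of summaryforMultipleRows can be pictured with a toy example. The sketch below assumes that the sum or max is taken over the channel intensities of each duplicated row; the helper pick_row, the column names and the values are hypothetical and are not part of MSstatsTMT.

```r
# Two rows for the same PSM in the same run (made-up intensities).
dup <- data.frame(
  PSM   = c("pepA_2", "pepA_2"),
  ch126 = c(100, 200),
  ch127 = c(400, 190),
  ch128 = c( 20, 210)
)
channels <- c("ch126", "ch127", "ch128")

# Hypothetical helper: score each row with `fun` over its channel
# intensities and keep the row with the highest score.
pick_row <- function(rows, channels, fun = sum) {
  score <- apply(rows[, channels, drop = FALSE], 1, fun, na.rm = TRUE)
  rows[which.max(score), , drop = FALSE]
}

by_sum <- pick_row(dup, channels, sum)  # row sums are 520 and 600
by_max <- pick_row(dup, channels, max)  # row maxima are 400 and 210

by_sum
by_max
```

In this example the two criteria disagree: sum keeps the second row, while max keeps the first row because of its single high intensity.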

Example

# Read in MaxQuant files
# proteinGroups <- read.table("proteinGroups.txt", sep="\t", header=TRUE)

# evidence <- read.table("evidence.txt", sep="\t", header=TRUE)

# Users should make this annotation file. It is not the output from MaxQuant.
# annotation.mq <- read.csv(file="MQ_Annotation.csv", header=TRUE)

input.mq <- MaxQtoMSstatsTMTFormat(evidence, proteinGroups, annotation.mq)
#> ** + Contaminant, + Reverse, + Only.identified.by.site, proteins are removed.
#> ** PSMs, that have all zero intensities across channels in each run, are removed.
#> ** 2 features have 1 or 2 intensities across runs and are removed.
#> ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows.
head(input.mq)
#>   ProteinName             PeptideSequence Charge
#> 1      O15042    AAAEIYEEFLAAFEGSDGNK(ly)      3
#> 2      Q9P258           DGQILPVPNVVVR(ar)      3
#> 3      Q96P70             ICPFTIAIFLK(ly)      3
#> 4      P36578         FCIWTESAFR(ar)K(ly)      3
#> 5      Q9P258       AAAAAWEEPSSGNGTAR(ar)      2
#> 6      Q96P70 VWTANPQQFVEDEDDDTFSYTVR(ar)      3
#>                             PSM   Channel Condition BioReplicate
#> 1    AAAEIYEEFLAAFEGSDGNK(ly)_3 channel.0      Norm       1.X126
#> 2           DGQILPVPNVVVR(ar)_3 channel.0      Norm       1.X126
#> 3             ICPFTIAIFLK(ly)_3 channel.0      Norm       1.X126
#> 4         FCIWTESAFR(ar)K(ly)_3 channel.0      Norm       1.X126
#> 5       AAAAAWEEPSSGNGTAR(ar)_2 channel.0      Norm       1.X126
#> 6 VWTANPQQFVEDEDDDTFSYTVR(ar)_3 channel.0      Norm       1.X126
#>                                        Run  Mixture Intensity
#> 1 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 Mixture1   1031.50
#> 2 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 Mixture1   2219.20
#> 3 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 Mixture1    478.17
#> 4 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 Mixture1    534.43
#> 5 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 Mixture1    866.26
#> 6 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01 Mixture1    388.78

SpectroMinetoMSstatsTMTFormat()

Preprocess PSM data from SpectroMine and convert into the required input format for MSstatsTMT.

Arguments

  • input : data name of SpectroMine PSM output. Read PSM sheet.
  • annotation : data frame which contains column Run, Channel, Condition, BioReplicate, Mixture.
  • fraction : indicates whether the data has fractions. If there are fractions, then overlapped peptide ions will be removed and then fractions are combined for each biological mixture.
  • filter_with_Qvalue : TRUE (default) filters out intensities whose value in the EG.Qvalue column is greater than qvalue_cutoff. These intensities are replaced with NA and treated as censored missing values for imputation.
  • qvalue_cutoff : cutoff for EG.Qvalue. Default is 0.01.
  • useUniquePeptide : TRUE (default) removes peptides that are assigned to more than one protein. We assume that only unique peptides are used for each protein.
  • rmPSM_withMissing_withinRun : TRUE will remove PSM with any missing value within each Run. Default is FALSE.
  • rmPSM_withfewMea_withinRun : used only when rmPSM_withMissing_withinRun = FALSE. TRUE (default) removes features that have only 1 or 2 measurements within each Run.
  • removeProtein_with1Peptide : TRUE removes proteins that have only one peptide and charge state. Default is FALSE.
  • summaryforMultipleRows : sum (default) or max. When a PSM has multiple measurements in a run, keep the measurement with the largest summed intensity (sum) or the largest maximal intensity (max).
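
The Q-value filter described for filter_with_Qvalue and qvalue_cutoff amounts to masking intensities rather than dropping rows. A minimal base-R sketch on a made-up table (only the EG.Qvalue column name comes from the argument description; everything else is illustrative):

```r
# Toy PSM table with made-up Q-values and intensities.
psm <- data.frame(
  PSM       = c("pepA_2", "pepB_3", "pepC_2"),
  EG.Qvalue = c(0.001, 0.05, 0.01),
  Intensity = c(1500, 800, 2300)
)
qvalue_cutoff <- 0.01

# Replace intensities whose Q-value exceeds the cutoff with NA.
# The rows stay in the table, so the NA can later be treated as
# a censored missing value.
filtered <- psm
filtered$Intensity[filtered$EG.Qvalue > qvalue_cutoff] <- NA

filtered
```

A Q-value exactly equal to the cutoff is kept here, because the comparison is strictly "greater than".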

Example

# Read in SpectroMine PSM report
# raw.mine <- read.csv('20180831_095547_CID-OT-MS3-Short_PSM Report_20180831_103118.xls', sep="\t")

# Users should make this annotation file. It is not the output from SpectroMine
# annotation.mine <- read.csv(file="Mine_Annotation.csv", header=TRUE)

input.mine <- SpectroMinetoMSstatsTMTFormat(raw.mine, annotation.mine)
#> ** Intensities with great than 0.01 in PG.QValue are replaced with NA.
#> ** Intensities with great than 0.01 in EG.Qvalue are replaced with NA.
#> ** 0 rows have all NAs are removed.
#> ** All peptides are unique peptides in proteins.
#> ** 0 features have 1 or 2 intensities across runs and are removed.
#> ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows.
head(input.mine)
#>   ProteinName                               PeptideSequence Charge
#> 1      Q9GZT9 _[TMT_Nter]AAAGGQGSAVAAEAEPGK[TMT_Lys]EEPPAR_      3
#> 2      Q9NVA2    _[TMT_Nter]K[TMT_Lys]ELEEEVNNFQK[TMT_Lys]_      3
#> 3      Q9NVA2                 _[TMT_Nter]SLDLVTMK[TMT_Lys]_      2
#> 4      Q9NVA2      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]_      3
#> 5      Q9NVA2      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]_      2
#> 6      Q99733                          _[TMT_Nter]VLAALQER_      2
#>                                               PSM  Channel Condition
#> 1 _[TMT_Nter]AAAGGQGSAVAAEAEPGK[TMT_Lys]EEPPAR__3 TMT6_126         3
#> 2    _[TMT_Nter]K[TMT_Lys]ELEEEVNNFQK[TMT_Lys]__3 TMT6_126         3
#> 3                 _[TMT_Nter]SLDLVTMK[TMT_Lys]__2 TMT6_126         3
#> 4      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]__3 TMT6_126         3
#> 5      _[TMT_Nter]AAAQLLQSQAQQSGAQQTK[TMT_Lys]__2 TMT6_126         3
#> 6                          _[TMT_Nter]VLAALQER__2 TMT6_126         3
#>   BioReplicate                                                   Run
#> 1          1_1 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw
#> 2          1_1 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw
#> 3          1_1 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw
#> 4          1_1 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw
#> 5          1_1 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw
#> 6          1_1 ch_19Jan2017_SM-1-1_Sp-6-2_CID-OT-MS3-Short_HpH_1.raw
#>   Mixture  Intensity
#> 1       1   382.1107
#> 2       1 33554.1900
#> 3       1 44713.6300
#> 4       1 20877.8700
#> 5       1   506.1669
#> 6       1 17143.5200

2. Protein summarization, normalization and visualization

2.1. proteinSummarization()

PSM-level quantification must be summarized to the protein level before testing for differentially abundant proteins. Normalization between MS runs, based on the normalization channels, is then applied to the summarized data. The msstats summarization method assumes that missing values are censored and imputes them before summarizing PSM-level data into protein-level data. The other methods (MedianPolish, Median and LogSum) do not impute missing values.
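
To make the three non-imputing methods concrete, the sketch below summarizes a toy PSM-by-channel matrix for one protein in one run. It assumes the usual definitions (Median: median of log2 intensities per channel; LogSum: log2 of the summed raw intensities per channel; MedianPolish: Tukey median polish, overall effect plus channel effect). These definitions and the toy values are assumptions for illustration and may differ in detail from what proteinSummarization does internally.

```r
# Toy PSM-level intensities for one protein in one run:
# rows are PSMs, columns are channels (values are made up).
intensity <- matrix(c(1024, 2048, 4096,
                       512, 1024, 2048,
                       256,  512, 1024),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(c("psm1", "psm2", "psm3"),
                                    c("126", "127N", "127C")))
log2_int <- log2(intensity)

# Median: median of the log2 intensities in each channel.
median_summary <- apply(log2_int, 2, median, na.rm = TRUE)

# LogSum: log2 of the summed raw intensities in each channel.
logsum_summary <- log2(colSums(intensity, na.rm = TRUE))

# MedianPolish: Tukey median polish from the stats package;
# the channel-level summary is the overall effect plus the column effect.
mp <- medpolish(log2_int, na.rm = TRUE, trace.iter = FALSE)
medianpolish_summary <- mp$overall + mp$col

rbind(Median = median_summary,
      LogSum = logsum_summary,
      MedianPolish = medianpolish_summary)
```

LogSum is dominated by the most intense PSMs, whereas Median and MedianPolish are robust to a single outlying PSM.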

Arguments

  • data : output of a converter function such as PDtoMSstatsTMTFormat, or PSM-level quantified data from other tools. It should have the columns ProteinName, PSM, Mixture, Run, Channel, Condition, BioReplicate and Intensity.
  • method : method used to summarize to the protein level. Four methods are available: msstats (default), MedianPolish, Median and LogSum.
  • normalization : normalization between MS runs. TRUE (default) requires at least one normalization channel in each MS run, annotated by Norm in the Condition column. It is performed after protein-level summarization. FALSE skips the normalization step.
  • MBimpute : used only when method = "msstats". TRUE (default) imputes missing values with an accelerated failure time model. FALSE imputes each missing value with the minimum value for that PSM.
  • maxQuantileforCensored : we assume that missing values are censored. maxQuantileforCensored is the maximum quantile used to decide whether a missing value is censored, for instance 0.999. Default is NULL.
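
One way to picture normalization between MS runs is as a per-run shift on the log2 scale, so that the Norm channel of every run lands on a common reference level. The sketch below implements that idea for a single protein with made-up abundances; the choice of the median across runs as the reference level is an assumption for illustration, not a statement of the package's exact procedure.

```r
# Toy protein-level log2 abundances for one protein in two runs
# (values are made up). "Norm" is the normalization channel.
protein <- data.frame(
  Run       = rep(c("run1", "run2"), each = 3),
  Condition = rep(c("Norm", "A", "B"), times = 2),
  Abundance = c(10, 11, 12,    # run1
                12, 13, 15),   # run2
  stringsAsFactors = FALSE
)

# Abundance of the normalization channel in each run.
norm_rows   <- protein[protein$Condition == "Norm", ]
norm_by_run <- tapply(norm_rows$Abundance, norm_rows$Run, median)

# Assumed reference level: median of the Norm abundances across runs.
target <- median(norm_by_run)

# Shift every channel of a run by the same amount, so that the
# Norm channel of each run equals the reference level.
shift <- target - norm_by_run[as.character(protein$Run)]
protein$Normalized <- protein$Abundance + unname(shift)

protein
```

After the shift, both Norm channels sit at 11, and the differences between conditions within each run are unchanged.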

Example

# use MSstats for protein summarization
quant.msstats <- proteinSummarization(input.pd,
                                       method="msstats",
                                       normalization=TRUE)
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     4-29
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
  |=================================================================| 100%
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-33
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
  |=================================================================| 100%
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-29
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
  |=================================================================| 100%
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     1-28
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
  |=================================================================| 100%
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     1-30
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     2-30
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     4-31
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-30
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     5-30
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-31
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-31
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     1-31
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     3-34
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     2-30
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
#>                        
#>   Summary of Features :
#>                          count
#> # of Protein                10
#> # of Peptides/Protein     5-32
#> # of Transitions/Peptide   1-1
#>                       
#>   Summary of Samples :
#>                            0.125 0.5 0.667 1 Norm
#> # of MS runs                   2   2     2 2    2
#> # of Biological Replicates     2   2     2 2    2
#> # of Technical Replicates      1   1     1 1    1
#> 
head(quant.msstats)
#>                                             Run Protein Abundance Channel
#> 1: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P04406  16.69137    127C
#> 2: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P04406  16.59216    129N
#> 3: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P04406  16.73246    128N
#> 4: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P04406  16.76139    129C
#> 5: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P04406  16.61366    127N
#> 6: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P04406  16.61832    130C
#>    BioReplicate Condition  Mixture
#> 1:     1.X127_C     0.125 Mixture1
#> 2:     1.X129_N     0.125 Mixture1
#> 3:     1.X128_N       0.5 Mixture1
#> 4:     1.X129_C       0.5 Mixture1
#> 5:     1.X127_N     0.667 Mixture1
#> 6:     1.X130_C     0.667 Mixture1

# use Median for protein summarization
# since median method doesn't impute missing values, 
# we need to use the input data without missing values
quant.median <- proteinSummarization(input.pd.no.miss,
                                       method="Median",
                                       normalization=TRUE)
head(quant.median)
#>                                             Run Protein Abundance Channel
#> 1: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P12277  15.52767     126
#> 2: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P12277  15.46150    127C
#> 3: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P12277  15.59290    127N
#> 4: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P12277  15.48790    128C
#> 5: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P12277  15.82771    128N
#> 6: 161117_SILAC_HeLa_UPS1_TMT10_Mixture1_01.raw  P12277  15.62461    129C
#>    BioReplicate Condition  Mixture
#> 1:       1.X126      Norm Mixture1
#> 2:     1.X127_C     0.125 Mixture1
#> 3:     1.X127_N     0.667 Mixture1
#> 4:     1.X128_C         1 Mixture1
#> 5:     1.X128_N       0.5 Mixture1
#> 6:     1.X129_C       0.5 Mixture1
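
As a rough illustration of what median-based summarization means, the sketch below computes, for each protein, run and channel, the median of the log2 PSM intensities. The toy data frame, its values and its column names (`Protein`, `Run`, `Channel`, `PSM`, `Intensity`) are assumptions made for this example. This is not the code used inside proteinSummarization, and it performs no normalization. Missing values are simply dropped here rather than imputed.

```r
# Toy PSM-level data (hypothetical values, for illustration only)
psm <- data.frame(
  Protein   = rep("P12277", 6),
  Run       = rep("Run1", 6),
  Channel   = rep(c("126", "127N"), each = 3),
  PSM       = rep(c("psm_a", "psm_b", "psm_c"), times = 2),
  Intensity = c(1024, 2048, 4096, 512, NA, 2048)
)

# Work on the log2 scale
psm$log2Intensity <- log2(psm$Intensity)

# Median across PSMs for each Protein x Run x Channel.
# The formula interface of aggregate() omits rows with NA by default
# (na.action = na.omit), so missing PSMs are left out, not imputed.
quant.toy <- aggregate(log2Intensity ~ Protein + Run + Channel,
                       data = psm, FUN = median)
names(quant.toy)[names(quant.toy) == "log2Intensity"] <- "Abundance"
quant.toy
```

Each row of `quant.toy` holds one protein-level value per run and channel, which is the same shape as the `Abundance` column shown in the output above.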

2.2 dataProcessPlotsTMT()

Visualization for exploratory data analysis. To illustrate the quantitative data after data preprocessing and quality control of TMT runs, dataProcessPlotsTMT takes as input the quantitative data from the converter functions (PDtoMSstatsTMTFormat, MaxQtoMSstatsTMTFormat and SpectroMinetoMSstatsTMTFormat) and the summarized data from proteinSummarization. It generates two types of figures, saved as pdf files:

  1. profile plot (specify “ProfilePlot” in option type), to identify the potential sources of variation for each protein;

  2. quality control plot (specify “QCPlot” in option type), to evaluate the systematic bias between MS runs.

Arguments

  • data.psm : name of the PSM-level data, which can be the output of the converter functions (PDtoMSstatsTMTFormat, MaxQtoMSstatsTMTFormat and SpectroMinetoMSstatsTMTFormat).
  • data.summarization : name of the protein-level data, which can be the output of the proteinSummarization function.
  • type : choice of visualization. “ProfilePlot” represents profile plot of log intensities across MS runs. “QCPlot” represents quality control plot of log intensities across MS runs.
  • ylimUp : upper limit for y-axis in the log scale. FALSE(default) sets the upper limit for Profile Plot and QC Plot to the rounded-off maximum of log2(intensities) after normalization, plus 3.
  • ylimDown : lower limit for y-axis in the log scale. FALSE(default) sets the lower limit for Profile Plot and QC Plot to 0.
  • x.axis.size : size of x-axis labeling for “Run” and “channel” in Profile Plot and QC Plot.
  • y.axis.size : size of y-axis labels. Default is 10.
  • text.size : size of the labels representing each condition at the top of the graph in Profile Plot and QC Plot. Default is 4.
  • text.angle : angle of the labels representing each condition at the top of the graph in Profile Plot and QC Plot. Default is 0.
  • legend.size : size of legend above graph in Profile Plot. Default is 7.
  • dot.size.profile : size of dots in profile plot. Default is 2.
  • ncol.guide : number of columns for legends at the top of plot. Default is 5.
  • width : width of the saved file. Default is 10.
  • height : height of the saved file. Default is 10.
  • which.Protein : proteins to plot, given either as protein names or as order numbers. Default is “all”, which generates a plot for every protein. For QC plot, “allonly” will generate one QC plot with all proteins.
  • originalPlot : TRUE(default) draws original profile plots, without normalization.
  • summaryPlot : TRUE(default) draws profile plots with protein summarization for each channel and MS run.
  • address : the name of the folder that will store the results. Default is the current working directory. Any other folder must already exist under the current working directory. An output pdf file is created automatically with the default name “ProfilePlot.pdf” or “QCplot.pdf”. The address option specifies where to store the file and can also modify the beginning of the file name. If address=FALSE, the plot is not saved as a pdf file but is shown in the plot window.

Example

## Profile plot
dataProcessPlotsTMT(data.psm = input.pd,
                     data.summarization = quant.msstats,
                     type = 'ProfilePlot',
                     width = 21, # adjust the figure width since there are 15 TMT runs. 
                     height = 7)
#> Warning: Removed 20 rows containing missing values (geom_point).
#> Drew the Profile plot for  P04406 ( 1  of  10 )
#> Warning: Removed 36 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_path).
#> Drew the Profile plot for  P06576 ( 2  of  10 )
#> Warning: Removed 29 rows containing missing values (geom_point).
#> Warning: Removed 1 rows containing missing values (geom_path).
#> Drew the Profile plot for  P12277 ( 3  of  10 )
#> Warning: Removed 3 rows containing missing values (geom_point).
#> Drew the Profile plot for  P23919 ( 4  of  10 )
#> Drew the Profile plot for  P31947 ( 5  of  10 )
#> Warning: Removed 68 rows containing missing values (geom_point).
#> Warning: Removed 6 rows containing missing values (geom_path).
#> Drew the Profile plot for  Q15233 ( 6  of  10 )
#> Warning: Removed 3 rows containing missing values (geom_point).
#> Drew the Profile plot for  Q16181 ( 7  of  10 )
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Drew the Profile plot for  Q9NSD9 ( 8  of  10 )
#> Warning: Removed 13 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_path).
#> Drew the Profile plot for  Q9UGP8 ( 9  of  10 )
#> Warning: Removed 6 rows containing missing values (geom_point).
#> Drew the Profile plot for  Q9Y450 ( 10  of  10 )
#> Warning: Removed 20 rows containing missing values (geom_point).
#> Warning: Removed 20 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  P04406 ( 1  of  10 )
#> Warning: Removed 36 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_path).
#> Warning: Removed 36 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  P06576 ( 2  of  10 )
#> Warning: Removed 29 rows containing missing values (geom_point).
#> Warning: Removed 1 rows containing missing values (geom_path).
#> Warning: Removed 29 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  P12277 ( 3  of  10 )
#> Warning: Removed 3 rows containing missing values (geom_point).
#> Warning: Removed 3 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  P23919 ( 4  of  10 )
#> Drew the Profile plot with summarization for  P31947 ( 5  of  10 )
#> Warning: Removed 68 rows containing missing values (geom_point).
#> Warning: Removed 6 rows containing missing values (geom_path).
#> Warning: Removed 68 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  Q15233 ( 6  of  10 )
#> Warning: Removed 3 rows containing missing values (geom_point).
#> Warning: Removed 3 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  Q16181 ( 7  of  10 )
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  Q9NSD9 ( 8  of  10 )
#> Warning: Removed 13 rows containing missing values (geom_point).
#> Warning: Removed 2 rows containing missing values (geom_path).
#> Warning: Removed 13 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  Q9UGP8 ( 9  of  10 )
#> Warning: Removed 6 rows containing missing values (geom_point).
#> Warning: Removed 6 rows containing missing values (geom_point).
#> Drew the Profile plot with summarization for  Q9Y450 ( 10  of  10 )

## Quality control plot
# dataProcessPlotsTMT(data.psm = input.pd,
#                     data.summarization = quant.msstats,
#                     type = 'QCPlot',
#                     width = 21, # adjust the figure width since there are 15 TMT runs
#                     height = 7)

3. groupComparisonTMT()

Tests for significant changes in protein abundance across conditions, based on a family of linear mixed-effects models for TMT experiments. For case-control studies (patients are not measured repeatedly), the experimental design is determined automatically and the appropriate statistical model is fitted.

Arguments

  • data : output of proteinSummarization (quant.msstats in the example below).
  • contrast.matrix : comparisons between conditions of interest. If it is not specified, all possible pairs of conditions are compared. Otherwise, supply a matrix with one row per comparison and one column per condition, with 1 and -1 marking the two conditions being compared and 0 elsewhere.

Example

# test for all the possible pairs of conditions
test.pairwise <- groupComparisonTMT(quant.msstats)
head(test.pairwise)
#>   Protein       Label        log2FC        SE      DF    pvalue adj.pvalue
#> 1  P04406   0.125-0.5 -0.0068040406 0.0161061 117.212 0.6734696  0.9466283
#> 2  P04406 0.125-0.667  0.0054529852 0.0161061 117.212 0.7355420  0.7355420
#> 3  P04406     0.125-1 -0.0071581834 0.0161061 117.212 0.6575445  0.8219306
#> 4  P04406   0.5-0.667  0.0122570258 0.0161061 117.212 0.4481747  0.6402496
#> 5  P04406       0.5-1 -0.0003541428 0.0161061 117.212 0.9824948  0.9824948
#> 6  P04406     0.667-1 -0.0126111686 0.0161061 117.212 0.4352028  0.8434770

# Only compare condition 0.125 and 1
levels(quant.msstats$Condition)
#> [1] "0.125" "0.5"   "0.667" "1"     "Norm"
# 'Norm' should not be included in the contrast
comparison <- matrix(c(-1, 0, 0, 1), nrow = 1)
# Set the name of each row
row.names(comparison) <- "1-0.125"
# Set the column names
colnames(comparison) <- c("0.125", "0.5", "0.667", "1")
comparison
#>         0.125 0.5 0.667 1
#> 1-0.125    -1   0     0 1
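
The same pattern extends to several comparisons at once. The sketch below builds a two-row contrast matrix in base R, so it runs without MSstatsTMT. The helper make_contrast_row and the object comparison.multi are illustrative names, not part of the package, and the condition levels are copied from the output above with 'Norm' dropped. It assumes that groupComparisonTMT accepts a contrast matrix with more than one row.

```r
# Condition levels from levels(quant.msstats$Condition), without 'Norm'
conditions <- c("0.125", "0.5", "0.667", "1")

# Illustrative helper (not an MSstatsTMT function): build one contrast row
# with +1 for the numerator condition and -1 for the denominator condition
make_contrast_row <- function(numerator, denominator, levels) {
  row <- setNames(rep(0, length(levels)), levels)
  row[numerator] <- 1
  row[denominator] <- -1
  row
}

# One named row per comparison; columns follow the order of 'conditions'
comparison.multi <- rbind(
  "1-0.125"   = make_contrast_row("1", "0.125", conditions),
  "0.667-0.5" = make_contrast_row("0.667", "0.5", conditions)
)
comparison.multi

# Each comparison should sum to zero across conditions
stopifnot(all(rowSums(comparison.multi) == 0))
```

Building rows from condition names rather than positions keeps each 1 and -1 aligned with the intended column, which is easy to get wrong when the vector is typed by hand.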

test.contrast <- groupComparisonTMT(data = quant.msstats, contrast.matrix = comparison)
head(test.contrast)
#>   Protein   Label        log2FC         SE      DF    pvalue adj.pvalue
#> 1  P04406 1-0.125  0.0071581834 0.01610610 117.212 0.6575445  0.8219306
#> 2  P06576 1-0.125 -0.0002522861 0.01850011 117.212 0.9891428  0.9891428
#> 3  P12277 1-0.125 -0.0116588958 0.02356725 117.212 0.6217326  0.8219306
#> 4  P23919 1-0.125  0.0330289712 0.02674342 117.212 0.2192886  0.5482215
#> 5  P31947 1-0.125  0.0357329623 0.02599206 117.212 0.1718270  0.5482215
#> 6  Q15233 1-0.125  0.0113716628 0.01745710 117.212 0.5160595  0.8219306
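
A typical next step is to filter the result table by adjusted p-value and to rank proteins by effect size. The sketch below does this in base R on a small data frame whose rows are copied from the head(test.contrast) output above, so it runs without MSstatsTMT. The object names res, sig and ranked and the 0.05 cutoff are illustrative assumptions, not package defaults.

```r
# Rows copied from the head(test.contrast) output above
res <- data.frame(
  Protein = c("P04406", "P06576", "P12277", "P23919", "P31947", "Q15233"),
  Label = "1-0.125",
  log2FC = c(0.0071581834, -0.0002522861, -0.0116588958,
             0.0330289712, 0.0357329623, 0.0113716628),
  adj.pvalue = c(0.8219306, 0.9891428, 0.8219306,
                 0.5482215, 0.5482215, 0.8219306),
  stringsAsFactors = FALSE
)

# Illustrative significance cutoff on the adjusted p-value
alpha <- 0.05

# Keep proteins below the cutoff; rows with a missing adj.pvalue are dropped
sig <- res[!is.na(res$adj.pvalue) & res$adj.pvalue < alpha, ]
nrow(sig)

# Rank all proteins by absolute log2 fold change, largest first
ranked <- res[order(-abs(res$log2FC)), ]
ranked[, c("Protein", "log2FC", "adj.pvalue")]
```

None of the six proteins shown passes the cutoff for this comparison, so sig is empty here; the same two lines apply unchanged to the full test.contrast table.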