Abstract Details

Activity Number: 428
Type: Contributed
Date/Time: Tuesday, August 2, 2016 : 2:00 PM to 3:50 PM
Sponsor: Biometrics Section
Abstract #321465
Title: Evaluating Imputation Methods for Integrating Proteomics Data Sets
Author(s): Yian Chen* and Kate Fisher and Eric A. Welsh and Steven Eschrich
Companies: Moffitt Cancer Center and PAREXEL and Moffitt Cancer Center and Moffitt Cancer Center
Keywords: proteomics ; integration ; missing ; imputation

High rates of missing values are common in proteomics experiments, and imputation is generally performed before analysis. In this study, we evaluated the performance of different imputation methods in the context of data integration. Motivated by the study of associations between phosphoproteomics quantification and activity-based protein profiling (ABPP) for kinase expression, we simulated phosphotyrosine (pY) and ABPP datasets under different combinations of missingness pattern (missing at random, missing at the low end, or a mixture of the two), sample size, and strength of correlation. We compared six approaches: minimum-value imputation, no imputation, mean imputation, K-nearest neighbors, probabilistic PCA, and a left-censored accelerated failure time (LAFT) model. Spearman correlation and LAFT were used to evaluate pairwise association. LAFT tends to have high false positive rates. When the sample size is very low and the proportion of missing values is high, no imputation performs reasonably well. Minimum-value imputation outperforms the other methods when the sample size is reasonable and values are missing at the low end.
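
The kind of comparison described above can be illustrated with a minimal sketch. It is not the authors' simulation: it assumes a bivariate normal model for a single pY/ABPP pair, treats "no imputation" as a complete-case analysis, and covers only low-end missingness with two of the six approaches. All function names, the sample size, the correlation, and the missing fraction are hypothetical choices made for illustration.

```python
import math
import random


def simulate_pair(n, rho, rng):
    """Draw n pairs from a bivariate standard normal with correlation rho (assumed model)."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * e)
    return xs, ys


def censor_low_end(values, fraction):
    """Mark the lowest `fraction` of values as missing (None): missing at the low end."""
    k = int(len(values) * fraction)
    if k == 0:
        return list(values)
    cutoff = sorted(values)[k - 1]
    return [None if v <= cutoff else v for v in values]


def impute_minimum(values):
    """Replace missing entries with the smallest observed value."""
    floor = min(v for v in values if v is not None)
    return [floor if v is None else v for v in values]


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def spearman_complete_cases(xs, ys):
    """'No imputation' (assumed meaning): use only pairs observed in both datasets."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return spearman([p[0] for p in pairs], [p[1] for p in pairs])


rng = random.Random(2016)  # fixed seed so the sketch is repeatable
py_full, abpp_full = simulate_pair(n=40, rho=0.6, rng=rng)
py_obs = censor_low_end(py_full, 0.3)
abpp_obs = censor_low_end(abpp_full, 0.3)

print("full data:     ", round(spearman(py_full, abpp_full), 3))
print("minimum value: ", round(spearman(impute_minimum(py_obs), impute_minimum(abpp_obs)), 3))
print("no imputation: ", round(spearman_complete_cases(py_obs, abpp_obs), 3))
```

Repeating this over many simulated pairs, with and without a true correlation, would give the false positive rates and power on which a comparison of this kind rests; a single run only shows how each approach shifts the estimated correlation relative to the full data.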

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association