Abstract Details

Activity Number: 126
Type: Contributed
Date/Time: Monday, August 1, 2016 : 8:30 AM to 10:20 AM
Sponsor: Biometrics Section
Abstract #319203
Title: Truncation-Based Nearest Neighbors Imputation for High-Dimensional Data with Detection Limit Thresholds
Author(s): Jasmit Shah* and Guy N. Brock and Shesh N. Rai and Aruni Bhatnagar
Companies: University of Louisville and The Ohio State University and University of Louisville and University of Louisville
Keywords: k-nearest neighbor ; missing value imputation ; metabolomics ; truncated normal

High-throughput technology makes it possible to monitor metabolites across different experiments and is widely used to detect differences in metabolite levels in many areas of biomedical research. Mass spectrometry has become one of the main analytical techniques for profiling a wide array of compounds in biological samples. Missing values are common in metabolomics datasets and can arise from both technical and biological sources. Most often a missing value is replaced by the minimum value, and this substitution can change the results of downstream analyses. In this study we propose a modified K-nearest neighbor (KNN) approach, called KNN truncation (KNN-TN), which accounts for truncation at the minimum value. The proposed approach assumes that the data follow a truncated normal distribution with the truncation point at the limit of detection (LOD). We compare the imputation results from KNN-TN with those of other KNN approaches, namely KNN based on correlation (KNN-CR) and KNN based on Euclidean distance (KNN-EU). The results of KNN-TN, KNN-CR, and KNN-EU were evaluated using the root mean square error (RMSE).
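To make the comparison above concrete, the following is a minimal Python sketch of Euclidean-distance KNN imputation (in the spirit of KNN-EU) scored by RMSE, together with the standard log-likelihood of a normal distribution left-truncated at the LOD. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the neighbor weighting, the value of K, or how KNN-TN uses the truncated normal fit, so the unweighted neighbor mean, the toy data matrix, the LOD of 0.95, and all function names here are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Root mean squared difference over coordinates observed in both rows.
    Missing entries are encoded as None (an assumption of this sketch)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))

def knn_impute(data, k=2):
    """Fill each missing entry with the unweighted mean of the same column
    in the k nearest rows that have that column observed."""
    imputed = [row[:] for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = [
                (euclidean_distance(row, other), other[j])
                for m, other in enumerate(data)
                if m != i and other[j] is not None
            ]
            candidates = [c for c in candidates if math.isfinite(c[0])]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed

def rmse(true_values, imputed_values):
    """Root mean square error between true and imputed values."""
    n = len(true_values)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_values, imputed_values)) / n)

def truncated_normal_loglik(values, mu, sigma, lod):
    """Log-likelihood of values from a normal(mu, sigma) left-truncated at lod.
    How KNN-TN uses such a fit is not described in the abstract."""
    def normal_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    tail = 1.0 - normal_cdf((lod - mu) / sigma)  # P(X >= lod)
    core = sum(
        -math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * ((x - mu) / sigma) ** 2
        for x in values
    )
    return core - len(values) * math.log(tail)

# Toy example: rows are metabolites, columns are samples (hypothetical values).
complete = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.1, 4.1],
    [0.9, 1.9, 2.9, 3.9],
    [5.0, 1.0, 6.0, 0.5],
]
lod = 0.95  # hypothetical detection limit; values below it are treated as missing
observed = [[v if v >= lod else None for v in row] for row in complete]
imputed = knn_impute(observed, k=2)

# Score only the entries that were masked.
masked = [(i, j) for i, row in enumerate(observed) for j, v in enumerate(row) if v is None]
score = rmse([complete[i][j] for i, j in masked], [imputed[i][j] for i, j in masked])
print("masked cells:", masked)
print("RMSE on masked cells:", round(score, 3))
```

In this toy run the imputed values come from neighbors observed above the LOD, so they can land above the detection limit even though the true values were below it. That is the kind of bias a truncation-aware variant is intended to address.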

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association