363 – Advances in Missing Data Imputation
Bi-directionally Imputing Missing Data in Gene Microarrays
Mortaza Jamshidian
California State University, Fullerton
Amol Kumar
California State University, Fullerton
We obtained gene microarray data from Alizadeh and Yoshimoto to compare modified imputation techniques. We randomly introduced 10%, 20%, 25%, and 30% missing data into the complete portions of the data sets and, after imputing, computed a normalized Frobenius norm and the correlation between the imputed data set and the complete data set. We considered K-nearest neighbors, principal components analysis, and normal-distribution-based imputation. We sought improvements by modifying current techniques; in particular, we found that imputing a microarray and its transpose and taking the average of the two results may yield improvements. We call this method bi-directional imputation. For methods that require a covariance matrix, we used a shrinkage estimator when there are more variables than observations.
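The averaging step behind bi-directional imputation, together with the two evaluation measures, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `knn_impute` is a hypothetical, simplified K-nearest-neighbors imputer written here for the example, the function names are invented, and the Frobenius norm is normalized by the norm of the complete data, which is one plausible choice that the abstract does not specify.

```python
import numpy as np


def knn_impute(X, k=3):
    """Hypothetical simplified KNN imputer: fill each NaN from the k nearest rows."""
    X = np.asarray(X, dtype=float)
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)  # fallback when no neighbor is usable
    out = X.copy()
    for i in np.flatnonzero(miss.any(axis=1)):
        # Mean squared distance from row i to every row, over jointly observed columns.
        shared = ~miss[i] & ~miss
        diff = np.where(shared, X - X[i], 0.0)
        counts = shared.sum(axis=1)
        dist = np.full(X.shape[0], np.inf)
        ok = counts > 0
        dist[ok] = (diff[ok] ** 2).sum(axis=1) / counts[ok]
        dist[i] = np.inf  # a row is not its own neighbor
        for j in np.flatnonzero(miss[i]):
            # Candidate neighbors must have column j observed.
            cand = np.flatnonzero(~miss[:, j] & np.isfinite(dist))
            if cand.size == 0:
                out[i, j] = col_means[j]
            else:
                nearest = cand[np.argsort(dist[cand])[:k]]
                out[i, j] = X[nearest, j].mean()
    return out


def bidirectional_impute(X, impute=knn_impute, **kwargs):
    """Impute the matrix and its transpose, then average the two results."""
    X = np.asarray(X, dtype=float)
    by_rows = impute(X, **kwargs)
    by_cols = impute(X.T, **kwargs).T
    return 0.5 * (by_rows + by_cols)


def normalized_frobenius(X_imputed, X_complete):
    """Assumed normalization: ||X_imputed - X_complete||_F / ||X_complete||_F."""
    return np.linalg.norm(X_imputed - X_complete) / np.linalg.norm(X_complete)


def correlation(X_imputed, X_complete):
    """Pearson correlation between all entries of the two matrices."""
    return np.corrcoef(X_imputed.ravel(), X_complete.ravel())[0, 1]


# Usage on synthetic data (not the microarray data): remove about 10% of entries at random.
rng = np.random.default_rng(0)
X_complete = rng.normal(size=(20, 8))
X_missing = X_complete.copy()
X_missing[rng.random(X_complete.shape) < 0.10] = np.nan

X_bi = bidirectional_impute(X_missing, k=3)
print(normalized_frobenius(X_bi, X_complete), correlation(X_bi, X_complete))
```

Because both passes leave observed entries untouched, the average changes only the entries that were missing. Any imputer that maps a matrix with NaNs to a completed matrix of the same shape could be passed as `impute` in place of the simplified KNN helper.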