Abstract:
|
The best linear unbiased predictor (BLUP) from a linear mixed model gives moderated and efficient predictions for clustered or correlated data. However, outliers among the individual observations or the random effects can degrade prediction accuracy. Such outliers may also point to an aberrant data-generating mechanism that warrants further investigation. Motivated by an RNA-Seq dataset, we illustrated a case with outliers among both the individual observations and the random effects, and showed how these outliers connect to underlying issues that need to be identified. Because the number of genes is substantial, we proposed a scalable algorithm that simultaneously estimates the linear mixed-effects model and automatically detects outliers.
|