Abstract:
|
Patient phenotyping is critical for the analysis of electronic health records (EHR). Recently, a Bayesian phenotyping model for pediatric type 2 diabetes was proposed that supports principled inference and uncertainty estimation while naturally addressing missingness mechanisms. However, estimation for this model was carried out using Markov chain Monte Carlo (MCMC), which does not scale well to large EHR databases. We address this challenge using variational inference (VI). Retaining the previously proposed generative model, we used Pólya-Gamma augmentation and derived a coordinate ascent variational inference (CAVI) algorithm to perform posterior inference. Additionally, we explored annealed variational inference to keep the algorithm from converging to a suboptimal local optimum because of poor initialization. We applied the proposed approach to simulated and real-world EHR data and compared the results with those from MCMC. VI returned results comparable to MCMC with substantially shorter computation time. This approach has potential applications across a broad range of EHR-derived phenotypes and could be extended to estimate time of onset for time-to-event outcomes.
|