All Times EDT

Abstract Details

Activity Number: 414 - Risk Modeling and Regression Techniques
Type: Contributed
Date/Time: Thursday, August 12, 2021 : 2:00 PM to 3:50 PM
Sponsor: Biometrics Section
Abstract #318439
Title: Gaussian Process Regression and Classification Using International Classification of Disease Codes as Covariates
Author(s): Sanvesh Srivastava* and Stephanie Gilbertson-White and Nick Street and Xongyi Xu and Yunyi Li
Companies: University of Iowa and The University of Iowa and The University of Iowa and The University of Iowa and University of Texas, Austin
Keywords: Bayesian learning; Gaussian process; Regression; Classification; String kernels

International Classification of Diseases (ICD) codes are widely used to encode diagnoses in electronic health records (EHRs). An ICD code contains information about a diagnosis, and a collection of ICD codes defines a chronic condition. Over the years, automated methods have been developed for predicting a variety of biomedical responses from EHRs; these methods borrow information among demographically and diagnostically similar patients. Comparatively little attention has been paid to developing patient similarity measures that model the structure of ICD codes and the presence of multiple chronic conditions in addition to a patient's primary diagnosis. Motivated by this problem, we first develop a string kernel function that defines the similarity between a pair of subsets of ICD codes and simultaneously uses information about the diagnoses and the related chronic conditions. Second, we extend this similarity measure to define a family of covariance functions on diagnoses encoded as subsets of ICD codes. Using a member of this family, we develop Gaussian process (GP) priors for Bayesian nonparametric regression and classification with diagnoses, in the form of ICD codes, as covariates.
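
The general idea of a kernel on subsets of ICD codes feeding a GP can be illustrated with a minimal sketch. It is not the kernel or covariance family proposed in this abstract, whose form is not given here. It assumes a simple shared-prefix kernel between two codes (ICD codes are hierarchical, so longer shared prefixes indicate more closely related diagnoses), a normalized sum over all code pairs as the set-level similarity, and a standard GP regression posterior mean with Gaussian noise. The function names, the `decay` and `noise` parameters, the example patients, and the response values are all hypothetical.

```python
import numpy as np


def code_kernel(a: str, b: str, decay: float = 0.5) -> float:
    """Shared-prefix similarity between two ICD codes (illustrative assumption).

    Each additional matching leading character adds decay**(depth - 1).
    This is an inner product of weighted prefix-indicator features,
    so it is positive semidefinite.
    """
    a, b = a.replace(".", ""), b.replace(".", "")
    total = 0.0
    for depth, (ca, cb) in enumerate(zip(a, b), start=1):
        if ca != cb:
            break
        total += decay ** (depth - 1)
    return total


def set_kernel(A, B, decay: float = 0.5) -> float:
    """Similarity between two non-empty collections of ICD codes.

    Sums the code-level kernel over all pairs, then applies cosine
    normalization so that set_kernel(A, A) == 1.
    """
    def raw(X, Y):
        return sum(code_kernel(x, y, decay) for x in X for y in Y)

    return raw(A, B) / np.sqrt(raw(A, A) * raw(B, B))


def gp_posterior_mean(train_sets, y, test_sets, noise: float = 0.1):
    """GP regression posterior mean using set_kernel as the covariance."""
    K = np.array([[set_kernel(a, b) for b in train_sets] for a in train_sets])
    K_star = np.array([[set_kernel(t, a) for a in train_sets] for t in test_sets])
    # Solve (K + noise * I) alpha = y rather than forming an explicit inverse.
    alpha = np.linalg.solve(K + noise * np.eye(len(train_sets)),
                            np.asarray(y, dtype=float))
    return K_star @ alpha


# Hypothetical patients, each described by a set of ICD-10 codes.
patients = [
    ("E11.9", "I10"),    # type 2 diabetes, hypertension
    ("E11.65", "I10"),   # type 2 diabetes with hyperglycemia, hypertension
    ("J45.909",),        # asthma
    ("J45.20", "J30.9"), # asthma, allergic rhinitis
]
responses = [1.0, 1.2, -0.5, -0.4]  # synthetic values, for illustration only

new_patient = [("E11.9",)]
print(gp_posterior_mean(patients, responses, new_patient))
```

Under these assumptions, the new patient's prediction is driven by the two training patients who share the `E11` prefix, because the similarity to the asthma patients is zero. Classification would additionally need a non-Gaussian likelihood and approximate inference, which this sketch does not cover.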

Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program