Abstract:
|
Motivated by the fact that in mobile health applications a person's response to an intervention may depend on his or her own circumstances at a given moment, which are difficult to capture or measure, we aim to develop personalized policies, one for each user, that optimize immediate outcomes. The proposed method can be viewed as a generalization of stochastic contextual bandits: it estimates personalized policies through random effects under a generalized linear mixed model, with a group-lasso-type penalty to prevent overfitting of individual deviations from the population model. We examine the conditions under which the proposed method works in the presence of endogenous time-varying covariates, and provide conditional optimality and marginal consistency results for the estimated policies. We apply our method to develop personalized push (``prompt'') schedules for 294 app users, with the goal of maximizing the prompt response rate given past app usage and other contextual factors.
|