Abstract:
|
In many scientific problems, the goal is to make inference about a specified "target population" of interest. For example, in a single-arm clinical trial, the target population can be defined by the treated patients, and the key question is what would have happened to them had they not been treated. Yet the available data may extend well beyond the target population. A crucial question, therefore, is how best to integrate all the information to improve inference for the target. We propose a distance-segmented regression (DSR) framework, in which (i) the focus is on the conditional relationship between an outcome and predictors only for sample profiles that lie within the target population, (ii) a distance metric measures how close each sample is to the target, and (iii) the conditional association may remain unchanged for samples that depart slightly from the target and then begin to deviate smoothly as the distance grows. We investigate estimation and inference procedures for DSR and demonstrate that this guided use of information can outperform the two commonly used alternatives, namely using all the data or using only the data within or close to the target.
|