Abstract:
|
In penalized regressions, the selection probabilities of covariates depend on the sizes of their coefficients as well as on the covariates' distributions. Candidate covariates measured in different units therefore need to be standardized before being fed into a penalized regression, so that their selection probabilities reflect their relative impacts and variable selection is fairer. However, when covariates of mixed data types (e.g., continuous, binary, or categorical) exist in the same dataset, commonly used standardization methods may lead to different selection probabilities even when the covariates have the same impact on, or association with, the outcome. In this paper, we propose a novel standardization method that aims to generate comparable selection probabilities in penalized regressions for continuous, binary, and categorical covariates with the same impact. We illustrate the advantages of the proposed method in simulation studies and apply it to the National Ambulatory Medical Care Survey data to select factors related to opioid prescription in the US. The proposed standardization method outperforms commonly used standardization methods in both the simulations and the real-data analysis.
|