Abstract:
|
In cell biology and cancer research, scientists often seek to identify proteins associated with a binary response variable, such as disease status or response to a specific treatment. The expression levels of these proteins are measured over time and can be treated as continuous curves. These functional expression profiles are then used as covariates in a logistic regression model. Because the number of such covariates is much larger than the sample size while the underlying structure is sparse, variable selection is required to identify the important proteins for the final model. We extend the classical LASSO for logistic regression to functional covariates by representing each covariate in a basis expansion and applying a penalty similar to the group LASSO, in which the coefficients of the same functional covariate are treated as a group. We use an efficient, computationally tractable algorithm, which also applies to generalized linear models, to solve the corresponding convex optimization problem. The group LASSO estimator for functional logistic regression is shown to be statistically consistent. The consistency of the methodology is validated
|