Abstract:
|
Model selection is a task of fundamental importance in statistics, and advances in high-dimensional model selection have been one of the major areas of progress over the past 20 years. Much of this progress has been due to penalized methods such as the lasso, and efficient methods for solving the relevant convex optimization problems that arise in exponential family models. However, in some model classes, such as directed graphical models, correct model selection is provably hard. We give a geometric explanation for why standard convex penalized methods cannot be adapted to directed graphs, based on the local geometry of the different models at points of intersection.
These results also show that it is 'statistically' hard to learn these models, and that much larger samples will typically be needed for moderate effect sizes. This has implications for other types of graphical model selection, and especially for causal models, as well as time series models. We provide some relevant heuristics that give insights into the feasibility of model selection in various classes of graphical model, including ancestral graph models.
https://arxiv.org/abs/1801.08364
|