Abstract:
|
The problem of valid uncertainty quantification for very high-dimensional data has not been adequately addressed in the literature, yet it is of paramount importance in many applications (e.g., scientific studies). We are particularly interested in the setting in which the dimension of the data is massively larger than the sample size. In this scenario, it has become commonplace to make very restrictive assumptions about low-dimensional structure, even though such assumptions cannot be verified from the data at hand. This practice can lead to the propagation of errors in the literature. Instead of imposing overly strong assumptions, we propose to reduce the dimensionality of the questions being asked of the data, while maximizing fidelity to an initial set of fine-scale questions. We minimize the degree of coarsening subject to restrictions on the accuracy of the answers provided. Theoretical justification is provided, and the approach is illustrated through applications to neuroscience studies.
|