Abstract:
|
We propose a hybrid framework that uses machine learning (ML) to decide which data-processing steps to apply before a formal statistical inference. We show that, by resampling from the data at hand, an ML algorithm can be trained that is tailored to the specific setting of the statistical analysis and that offers informed recommendations to guide the course of that analysis. Monte Carlo experiments demonstrate the method’s effectiveness and show how that effectiveness depends on the classifier’s architecture. We conclude with an application to change-point estimation, using data on fatigue crack growth in additively manufactured titanium. The crack growth rate never decreases under increased stress, but it is classified into distinct regimes to determine material properties. Isotonic regression is therefore applicable because of the monotonic structure, but it may not benefit the change-point estimation. In this example, we use an ML classifier tailored to the available data to decide whether to apply isotonic regression before estimating the change-point that separates the crack growth regimes.
|
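The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; every concrete choice is an assumption made for illustration: a residual bootstrap as the resampling scheme, a two-segment linear fit as the change-point estimator, two hand-picked summary features, and a nearest-centroid rule standing in for the ML classifier. All function names (`pava`, `change_point`, `features`, `recommend_isotonic`) are hypothetical, and the data at the end are synthetic.

```python
import numpy as np

def pava(y):
    """Non-decreasing least-squares fit via pool-adjacent-violators."""
    blocks = []  # each block is [mean, number of pooled points]
    for v in y:
        blocks.append([float(v), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    return np.concatenate([np.full(w, m) for m, w in blocks])

def change_point(x, y, min_seg=3):
    """Index k minimizing the SSE of separate lines fitted to x[:k] and x[k:]."""
    best_k, best_sse = None, np.inf
    for k in range(min_seg, len(x) - min_seg + 1):
        sse = 0.0
        for xs, ys in ((x[:k], y[:k]), (x[k:], y[k:])):
            coef = np.polyfit(xs, ys, 1)
            sse += np.sum((ys - np.polyval(coef, xs)) ** 2)
        if sse < best_sse:
            best_k, best_sse = k, sse
    return best_k

def features(y):
    """Assumed summary features: share of decreasing steps, relative step spread."""
    d = np.diff(y)
    return np.array([np.mean(d < 0), np.std(d) / (y.max() - y.min() + 1e-12)])

def recommend_isotonic(x, y, n_boot=200, seed=0):
    """Return True if the trained rule recommends isotonic regression first."""
    rng = np.random.default_rng(seed)
    k0 = change_point(x, y)  # treated as the "truth" for the pseudo-datasets
    fitted = np.empty(len(y))
    for sl in (slice(0, k0), slice(k0, None)):
        fitted[sl] = np.polyval(np.polyfit(x[sl], y[sl], 1), x[sl])
    resid = y - fitted
    F, labels = [], []
    for _ in range(n_boot):
        # Residual bootstrap with a random noise scale (an assumption).
        y_b = fitted + rng.uniform(0.5, 2.0) * rng.choice(resid, size=len(y))
        err_raw = abs(change_point(x, y_b) - k0)
        err_iso = abs(change_point(x, pava(y_b)) - k0)
        F.append(features(y_b))
        labels.append(int(err_iso < err_raw))  # 1 = isotonic step helped
    F, labels = np.array(F), np.array(labels)
    # Nearest-centroid classifier; if only one class occurs, it is always predicted.
    centroids = {c: F[labels == c].mean(axis=0) for c in np.unique(labels)}
    f0 = features(y)
    return bool(min(centroids, key=lambda c: np.sum((f0 - centroids[c]) ** 2)))

# Synthetic monotone two-regime data, for illustration only.
rng = np.random.default_rng(1)
x = np.arange(30, dtype=float)
y = np.where(x < 15, 0.2 * x, 3.0 + 1.0 * (x - 15)) + rng.normal(0, 0.5, 30)
print("apply isotonic regression first:", recommend_isotonic(x, y, n_boot=50))
```

The key design point is that the classifier is trained only on pseudo-datasets derived from the data at hand, so its recommendation is specific to that sample size, noise level, and change-point location rather than to a generic benchmark.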