Abstract:
|
A fundamental problem in the practice and teaching of data science is how to evaluate the quality of a given data analysis, which is distinct from evaluating the science or question underlying that analysis. Previously, we defined a set of principles for describing data analyses that can be used to create a data analysis and to characterize the variation between data analyses. Here, we introduce a metric for evaluating quality that we call the \textit{alignment} of a data analysis between the data analyst and an audience. We define a successfully aligned data analysis as one in which the principles on which the analysis is developed match between the analyst and the audience. In this paper, we propose a statistical model and a general framework for evaluating the alignment of a data analysis. We argue that this framework can guide practicing data scientists and students in data science courses in building better data analyses.
|