Abstract:
|
Producing transparent analyses efficiently can be difficult for beginners and tedious for experienced analysts. This implies a need for computing systems and environments that not only satisfy reproducibility and accountability standards but also allow analysts to explore, operate, collaborate on, and refine complex analytical structures. To this end, we have developed an R package and R Shiny application called 'adapr' (Accountable Data Analysis Process in R), built on the principle of accountable units. An accountable unit is a data file (statistic, table, or graphic) that can be associated with a provenance: when and how it was created, and who created it. Accountable units are similar to the verifiable computational results proposed by Gavish and Donoho. A key element is that an individual accountable unit is shareable and can be incorporated into a collaborative project. Reproducing collaborative work may be highly complex, requiring computations to be repeated on multiple systems by multiple authors; however, determining the provenance of each unit is simpler, requiring only a search using file hashes and version control systems.
|