Bayesian Inference and Sensitivity Analysis with Missing Data
*Joseph W. Hogan, Brown University 

This talk provides a Bayesian perspective on inference and sensitivity analysis with incomplete data. Assessment of model sensitivity is a broad topic that can encompass distributional assumptions, parametric structure, sensitivity to outliers, and the influence of individual data points. When the intended sample is completely observed, many of these modeling assumptions can be checked empirically; our ability to refute them with any degree of confidence is limited only by sample size, so in that sense they can be subjected to empirical critique. Assumptions required for fitting models to incomplete data are different because they apply to data that cannot be observed and are therefore inherently untestable. Put simply, they are subjective.

The need for subjectivity in analyses of incomplete data is not frequentist or Bayesian, parametric or nonparametric; it is just a feature of the problem. But this feature makes analysis of incomplete data fertile ground for the Bayesian approach, in which subjective components of a model are formally represented in terms of prior distributions. Inference from incomplete data essentially follows one of three approaches: (i) making an assumption such as missing at random (MAR) and assuming it holds without further critique; (ii) fitting and comparing models under several different assumptions about the missing data distribution; (iii) reporting bounds rather than point estimates to reflect the incompleteness of information in the data (as in Manski, 2007).
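
As a rough illustration of how the three approaches differ, the sketch below applies each to a binary outcome with some responses missing. It is not taken from the talk: the function names, the toy data, and the scenario values are hypothetical, and with no covariates the MAR assumption reduces to treating nonrespondents as having the same success rate as respondents.

```python
from typing import Dict, Sequence, Tuple


def mean_under_assumption(y_obs: Sequence[int], n_missing: int, p_missing: float) -> float:
    """Overall success proportion if the missing cases succeed at rate p_missing."""
    n_total = len(y_obs) + n_missing
    return (sum(y_obs) + n_missing * p_missing) / n_total


def three_approaches(
    y_obs: Sequence[int],
    n_missing: int,
    scenarios: Tuple[float, ...] = (0.0, 0.25, 0.5),
) -> Dict[str, object]:
    """Summarize a binary outcome under (i) MAR, (ii) several scenarios, (iii) bounds."""
    p_obs = sum(y_obs) / len(y_obs)
    return {
        # (i) MAR without covariates: missing cases resemble observed cases.
        "mar": mean_under_assumption(y_obs, n_missing, p_obs),
        # (ii) Compare several assumed success rates among the missing cases.
        "scenarios": {p: mean_under_assumption(y_obs, n_missing, p) for p in scenarios},
        # (iii) Worst-case bounds: all missing cases fail, or all succeed.
        "bounds": (
            mean_under_assumption(y_obs, n_missing, 0.0),
            mean_under_assumption(y_obs, n_missing, 1.0),
        ),
    }


if __name__ == "__main__":
    # Hypothetical data: 8 observed outcomes (3 successes) and 2 missing outcomes.
    print(three_approaches([1, 1, 0, 0, 1, 0, 0, 0], n_missing=2))
```

The width of the bounds equals the missing fraction, which is one way to see that no amount of observed data narrows them.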

In this talk, we demonstrate how models can be parameterized in ways that make clear where purely subjective input is being used. The degree of uncertainty about untestable assumptions is encoded in a prior and is ultimately reflected in posterior distributions. Bayesian sensitivity analysis is therefore an assessment of the impact of prior distributions used to represent the structure of and uncertainty about purely subjective assumptions. These ideas are illustrated using several analyses of data from the Commit to Quit Study, a randomized trial examining the effect of vigorous exercise on smoking cessation among women (Marcus et al., 1999).
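
One way such a parameterization might look is sketched below for a single-arm binary outcome. This is an illustrative pattern-mixture setup, not the model used in the talk or in the Commit to Quit analyses: the function name, the uniform Beta(1, 1) priors, the normal prior on the sensitivity parameter `delta`, and the counts are all assumptions. Here `delta` is the log odds ratio of success for missing versus observed cases. The data carry no information about it, so its posterior equals its prior, and changing that prior is the sensitivity analysis.

```python
import numpy as np


def posterior_overall_rate(n_success, n_obs, n_missing, delta_mean, delta_sd,
                           n_draws=20000, seed=0):
    """Monte Carlo posterior draws of the overall success rate.

    delta (assumed prior: Normal(delta_mean, delta_sd)) shifts the log odds of
    success for missing cases relative to observed cases.
    """
    rng = np.random.default_rng(seed)

    # Identified by the data: success rate among observed cases and response rate,
    # each with a conjugate Beta posterior under an assumed Beta(1, 1) prior.
    theta_obs = rng.beta(1 + n_success, 1 + n_obs - n_success, size=n_draws)
    pi_obs = rng.beta(1 + n_obs, 1 + n_missing, size=n_draws)

    # Not identified by the data: draws come straight from the prior.
    delta = rng.normal(delta_mean, delta_sd, size=n_draws)

    # Success rate among missing cases implied by theta_obs and delta.
    logit_obs = np.log(theta_obs) - np.log1p(-theta_obs)
    theta_mis = 1.0 / (1.0 + np.exp(-(logit_obs + delta)))

    # Mixture over the observed and missing patterns.
    return pi_obs * theta_obs + (1.0 - pi_obs) * theta_mis


if __name__ == "__main__":
    # Hypothetical counts: 30 successes among 80 observed, 20 missing.
    for label, mean, sd in [("MAR (delta = 0)", 0.0, 0.0),
                            ("pessimistic, certain", -1.0, 0.0),
                            ("pessimistic, uncertain", -1.0, 1.0)]:
        draws = posterior_overall_rate(30, 80, 20, mean, sd)
        lo, hi = np.percentile(draws, [2.5, 97.5])
        print(f"{label}: mean={draws.mean():.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

Fixing `delta` at zero recovers the MAR analysis, while a prior with nonzero spread carries uncertainty about the untestable assumption through to the posterior interval.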