ASA Ethics Case Study #3

The following article is a case study concerning the ethical use of statistics. You are invited to read it and join the discussion. Please send your contributions by e-mail to the Ethics Committee Chair, specifying in the subject line the case study to which you wish to contribute.

Ethics of Data Quality

A large company serves both government and private clients. During normal operations, it collects a huge amount of data which is unavailable anywhere else. These data are used internally and also used to meet information requests from clients, the media, researchers, and the general public. The data are shared with government agencies. In some uses, the data have significant social impact.

A statistician who meets requests for information based on these data is concerned because there are no control processes in place to ensure uniform data quality. There are no audit procedures by which any particular counts or compilations could be verified independently. She feels that, on the whole, the data are probably "pretty good" but are likely to vary widely in quality from one data set to another. She has proposed creating a statistical services group, which would institute data quality standards and procedures and improve the availability of analytic products using these data. Her proposal has been applauded by management but perpetually left unfunded.

Colleagues with whom the statistician has discussed this matter point out that thousands of data sets lacking data quality standards exist and are widely used. They also point out that even where data quality control standards are in place, it can take years or decades to identify and resolve specific data quality problems. Still, the statistician is ethically very uncomfortable with her role in preparing compilations and reports on these data. She does not want to put her professional reputation on the line with such products, given that the recipients do not know what they are getting. She is considering adding a disclaimer to each data product to inform customers about the lack of data quality control. She is also very tempted to resolve the issue by taking other available employment.

You are a close friend of this person, and she has asked for your advice. You are not employed by the same organization and do not know its internal politics or culture. Still, she values your judgment highly, especially in matters of professional ethics. Your advice is quite likely to be the deciding factor in her decision about what course to take.

What issues of statistical ethics are involved here?

What would you advise this statistician to do?

Does your advice change depending on the subject matter of the data: say, demography versus transportation or product safety data, or utility services usage versus public health-related data?

Does your advice depend on the individual's level of responsibility in the organization, say technician versus middle management versus executive?