Keywords: graphics, false positive, false negative, exploratory research, GWAS
Introduction Exploratory research using large datasets often takes the form of ‘fishing,’ where multiple measures (e.g. genes, nutrients) are simultaneously cross-correlated with one or more outcomes. Such research is vulnerable to both type 1 and type 2 error, and pooled effect estimates can miss true heterogeneity of results. We present a graphical method for exploring associations between a single predictor and a set of outcomes.

Methods To illustrate the method, we correlated crimes committed in Los Angeles with lunar phase and with gender of victim. For each type of crime we calculated a summary measure of effect size and plotted it against sample size, with ‘no association’ marked. Presence of a main effect was defined as a scatterplot centered far from ‘no association’; presence of heterogeneity was defined as a scatterplot ranging far from ‘no association.’ Where heterogeneity was present, extreme-valued points were examined for similarities.

Results Both graphs showed nearly null main effects, but one (victim gender) had significant within-outcome heterogeneity while the other (lunar phase) did not. The plot of sample size against mean lunar illumination was nearly triangular and centered on ‘no association,’ and crimes that tended to occur near full or new moon had little in common besides rarity. There was also little main effect of gender, but significant heterogeneity. Offenses against each sex tended to cluster: sex offenses tended to victimize females, while attacks on emergency workers tended to victimize males.

Interpretation This multifactorial outcome (crime) showed null associations with lunar phase, but not with victim gender. Future analyses of lunar effects may safely ignore heterogeneity between crimes; future analyses of gender may collapse “sex offenses” into a cluster but should keep them separate from attacks on emergency workers. Similar tools may be used to reduce type 1 and type 2 errors in other datasets.
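
The plotting step described in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The effect-size measure (proportion of female victims), the ‘no association’ value of 0.5, the 1.96 normal-approximation band, the function names and the toy records are all hypothetical choices made for the example. The band is one possible way to make "ranging far from ‘no association’" concrete, since sampling noise alone produces a roughly triangular scatter that narrows as sample size grows.

```python
from collections import defaultdict
from math import sqrt

# Assumed 'no association' value for victim gender (hypothetical choice;
# the overall proportion across all crimes would be another option).
NULL_PROPORTION = 0.5


def summarize(records):
    """Turn (crime_type, is_female_victim) pairs into one point per crime type.

    Returns a list of (crime_type, sample_size, proportion_female),
    sorted by crime type. Each tuple is one point on the scatterplot.
    """
    counts = defaultdict(lambda: [0, 0])  # crime -> [n, n_female]
    for crime, is_female in records:
        counts[crime][0] += 1
        counts[crime][1] += int(is_female)
    return [(crime, n, k / n) for crime, (n, k) in sorted(counts.items())]


def funnel_half_width(n, null=NULL_PROPORTION, z=1.96):
    """Half-width of the band expected around the null from sampling noise
    alone, using a normal approximation to the binomial (an assumption)."""
    return z * sqrt(null * (1 - null) / n)


def flag_extremes(points, null=NULL_PROPORTION, z=1.96):
    """Crime types whose point lies outside the band around 'no association'."""
    return [
        crime
        for crime, n, p in points
        if abs(p - null) > funnel_half_width(n, null, z)
    ]


# Toy, made-up records purely for illustration (not the Los Angeles data).
records = (
    [("sex offense", True)] * 36 + [("sex offense", False)] * 4
    + [("assault on emergency worker", True)] * 6
    + [("assault on emergency worker", False)] * 24
    + [("burglary", True)] * 50 + [("burglary", False)] * 50
)

points = summarize(records)
extremes = flag_extremes(points)
# To draw the graph, plot proportion_female against sample_size for each
# entry in `points` and add a reference line at NULL_PROPORTION.
```

In this toy input the two crime types built to be skewed fall outside the band while the balanced one sits on the null line. The flagged points are the ones that would then be examined for similarities.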