A new dependence parameter approach to improve the design of cluster randomized trials with binary outcomes
*Catherine M. Crespi, University of California Los Angeles 
Weng Kee Wong, University of California Los Angeles 
Sheng Wu, University of California Los Angeles 

Keywords: intraclass correlation coefficient, clustered binary data, cluster randomized trials, sample size calculation, power

Power and sample size calculations for cluster randomized trials require predicting the degree of correlation that will be realized among the outcomes of participants in the same cluster. This correlation is typically quantified by the intra-cluster correlation coefficient (ICC), defined as the Pearson correlation between the outcomes of two members of the same cluster or as the proportion of the total variance attributable to variance between clusters. It is widely known but perhaps not fully appreciated that for binary outcomes, the ICC is a function of the outcome prevalence. Because the two are intrinsically related, the ICC is difficult to generalize across study conditions and between studies with different outcome prevalences. We use a simple parameterization of the ICC that isolates the part of the ICC measuring dependence among responses within a cluster from the outcome prevalence. We incorporate this parameterization into sample size calculations for cluster randomized trials and show that, whereas our method leads to sample size requirements that achieve the desired level of power, the traditional approach using the ICC tends to overpower studies under many scenarios. We show how estimates of this newly defined dependence parameter, R, can be obtained from previous studies using simple statistics. The R parameter also has an intuitive meaning that facilitates interpretation. Thus the R parameter has a number of advantages over the ICC as a measure of dependence among binary outcomes in cluster randomized trials.
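
For readers who want to see the traditional ICC-based calculation that the abstract contrasts with the R approach, the following is a minimal Python sketch. It is not the authors' method and does not define or use R, which the abstract does not specify. It assumes the textbook design-effect formula 1 + (m - 1) * ICC with equal cluster sizes m and a two-sided normal-approximation test of two proportions; the function names pairwise_icc and clusters_per_arm are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def pairwise_icc(p_both: float, prevalence: float) -> float:
    """Pearson correlation between two exchangeable binary outcomes.

    p_both is P(Y1 = 1, Y2 = 1) for two members of the same cluster.
    The denominator prevalence * (1 - prevalence) is where the ICC's
    dependence on outcome prevalence enters.
    """
    return (p_both - prevalence ** 2) / (prevalence * (1.0 - prevalence))


def clusters_per_arm(p1: float, p2: float, icc: float, cluster_size: int,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Clusters per arm under the conventional design-effect approach.

    Assumption (not from the abstract): equal cluster sizes and the
    standard inflation factor 1 + (cluster_size - 1) * icc.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    # Individuals per arm if outcomes were independent.
    n_individual = z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
    # Inflate for within-cluster correlation, then convert to clusters.
    design_effect = 1 + (cluster_size - 1) * icc
    return math.ceil(n_individual * design_effect / cluster_size)


if __name__ == "__main__":
    # Illustrative inputs only; these numbers are not taken from any study.
    print(pairwise_icc(p_both=0.30, prevalence=0.5))
    print(clusters_per_arm(p1=0.5, p2=0.4, icc=0.05, cluster_size=20))
```

In this sketch a single ICC value is applied to both arms even though the arms have different prevalences p1 and p2, which is the kind of assumption the abstract identifies as problematic for binary outcomes.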