Online Program

Tuesday, September 26
Tue, Sep 26, 11:45 AM - 1:00 PM
Various Rooms
Roundtable Discussions

TL22: Cluster-Randomized Trials: Considerations for Power and Analysis (300443)

*Todd Durham, QuintilesIMS 
*William Hawkes, QuintilesIMS 

The goal of this roundtable is to acquaint participants with some unique features of cluster-randomized designs that affect power and to discuss analytical strategies that can be used to analyze the data once collected. The use of randomized pragmatic trials in real-world evidence generation faces certain challenges: for example, randomizing patients within a single practice might lead to contamination of subjects, masking of effects, and logistical issues. To deal with these risks, cluster-randomized designs have been used. These designs randomize sites (practices) rather than individuals. This approach reduces the power of a given analysis as a function of the size of the sites (known as clusters) and the within-cluster correlation between patients at a given site. To account for this ‘design effect’ of cluster-randomized studies, the sample size must be inflated by a factor of 1 + (n-1)*rho, where n is the cluster size (number of patients per practice) and rho is the intra-class correlation coefficient, a measure that quantifies the within-cluster correlation. Even values of rho as small as 0.05 can force a substantial increase in the number of subjects needed. What is not obvious is that reducing the size of the clusters (sites) while increasing their number is a more effective way to limit this sample size inflation. Increasing the number of subjects per site affords only very modest increases in statistical power, and these increases diminish rapidly with increasing cluster size. Furthermore, if patient-level analyses that ignore the clustering are undertaken, the test statistics should be corrected for the design effect described above: t statistics are divided by the square root of the design effect, and F statistics by the design effect itself. The result is to weaken the statistical test evidence as a function of both cluster size and within-cluster correlation.
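The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration and not part of the roundtable materials: the function names and the starting figure of 400 subjects are hypothetical, and the formula assumes equal cluster sizes.

```python
import math


def design_effect(cluster_size, icc):
    """Design effect 1 + (n - 1) * rho for equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def inflated_sample_size(n_individual, cluster_size, icc):
    """Total subjects needed once an individually randomized
    sample size is multiplied by the design effect."""
    total = n_individual * design_effect(cluster_size, icc)
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(total, 6))


def clusters_needed(n_individual, cluster_size, icc):
    """Number of clusters (sites) required at the given cluster size."""
    total = inflated_sample_size(n_individual, cluster_size, icc)
    return math.ceil(total / cluster_size)


def adjusted_t(t_statistic, cluster_size, icc):
    """Deflate a patient-level t statistic by sqrt(design effect)."""
    return t_statistic / math.sqrt(design_effect(cluster_size, icc))


if __name__ == "__main__":
    # Hypothetical: 400 subjects would suffice under individual randomization.
    for m in (10, 50):
        print(m, design_effect(m, 0.05),
              inflated_sample_size(400, m, 0.05),
              clusters_needed(400, m, 0.05))
```

With rho = 0.05, sites of 10 patients give a design effect of 1.45 (580 subjects in 58 sites), while sites of 50 patients give 3.45 (1380 subjects in 28 sites). Under these assumed numbers the smaller clusters need far fewer subjects in total, at the cost of recruiting more sites.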
A growing body of analysis techniques is available that account for within-cluster correlations, including Generalized Estimating Equations and Mixed-Effect models. These are easily implemented in SAS PROC GENMOD, PROC MIXED, and PROC GEE. These procedures allow for explicit specification of the covariance matrix, provide robust standard error estimates, and are relatively tolerant of missing data under certain assumptions about the mechanism underlying missingness. We will discuss these approaches and compare and contrast them with analyzing the clusters as the units of observation.
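For comparison, treating the clusters as the units of observation can be as simple as a two-sample t-test on site means. The sketch below is one common, unweighted form of that analysis and is not necessarily the approach the discussion leaders have in mind. It assumes roughly equal cluster sizes, and the function names and data layout (a dictionary mapping a site label to its patients' outcomes) are hypothetical.

```python
import math
from statistics import mean, variance


def cluster_means(arm):
    """Collapse each cluster's patient outcomes to a single mean."""
    return [mean(outcomes) for outcomes in arm.values()]


def cluster_level_t(arm_a, arm_b):
    """Pooled-variance two-sample t-test on cluster means.

    Returns the t statistic and its degrees of freedom, which
    depend on the number of clusters, not the number of patients.
    """
    means_a = cluster_means(arm_a)
    means_b = cluster_means(arm_b)
    k_a, k_b = len(means_a), len(means_b)
    df = k_a + k_b - 2
    pooled_var = ((k_a - 1) * variance(means_a)
                  + (k_b - 1) * variance(means_b)) / df
    std_err = math.sqrt(pooled_var * (1 / k_a + 1 / k_b))
    t_stat = (mean(means_a) - mean(means_b)) / std_err
    return t_stat, df


if __name__ == "__main__":
    # Made-up outcomes for three sites per arm, for illustration only.
    treated = {"site1": [1, 3], "site2": [3, 5], "site3": [5, 7]}
    control = {"site4": [0, 2], "site5": [2, 4], "site6": [4, 6]}
    print(cluster_level_t(treated, control))
```

Because each site contributes a single value, the within-cluster correlation no longer needs to be modeled, but the test has only as many degrees of freedom as there are clusters minus two.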

Questions for participants: Have you worked on a cluster-randomized study before? If so, how did you handle the power analysis, and how much did the cluster-randomized design affect the required sample size? How would you analyze data from a cluster-randomized trial? What factors do you consider in your choice? For what kinds of research questions will cluster-randomized trials be most useful? What are the most pressing statistical challenges we need to address before cluster-randomized trials reach their potential?