Sample Size Requirements to Test Subgroup-Specific Treatment Effects in Cluster-Randomized Trials
October 10, 2023
IMPACT members Keith Goldfeld, DrPH, MS, MPA, Fan Li, PhD, Monica Taljaard, PhD, and Xueqi Wang, PhD, are among the authors of an article that advances the methodology of cluster-randomized trials (CRTs). Previous research has focused primarily on sample size methods for testing differences between subgroup-specific treatment effects. This study addresses a critical gap by proposing formal procedures for directly testing the subgroup-specific treatment effects themselves. The authors emphasize the importance of understanding whether an intervention is effective within each predefined participant subgroup, particularly for healthcare delivery interventions intended to meet health objectives.
Abstract
Cluster-randomized trials (CRTs) often allocate intact clusters of participants to treatment or control conditions and are increasingly used to evaluate healthcare delivery interventions. While previous studies have developed sample size methods for testing confirmatory hypotheses of treatment effect heterogeneity in CRTs (i.e., targeting the difference between subgroup-specific treatment effects), sample size methods for testing the subgroup-specific treatment effects themselves have not received adequate attention, despite a rising interest in health equity considerations in CRTs. In this article, the authors develop formal methods for sample size and power analyses for testing subgroup-specific treatment effects in parallel-arm CRTs with a continuous outcome and a binary subgroup variable. The authors point out that the variances of the subgroup-specific treatment effect estimators and their covariance are given by weighted averages of the variance of the overall average treatment effect estimator and the variance of the heterogeneous treatment effect estimator. This analytical insight facilitates an explicit characterization of the requirements for both the omnibus test and the intersection–union test to achieve the desired level of power. Generalizations to allow for subgroup-specific variance structures are also discussed. The authors report on a simulation study that validates the proposed sample size methods and demonstrates that the empirical power corresponds well with the predicted power for both tests. The design and setting of the Umeå Dementia and Exercise (UMDEX) CRT in older adults are used to illustrate the proposed sample size methods.
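To make the two tests concrete, the following is a minimal Python sketch of how power might be computed once the variances and covariance of the two subgroup-specific effect estimators are in hand. It is not the authors' implementation. The abstract does not reproduce the closed-form variance expressions, so the sketch takes them as inputs, and every numeric value in it is hypothetical. It also assumes a two-degree-of-freedom Wald chi-square form for the omnibus test, two-sided z-tests for the intersection–union test, and a bivariate normal approximation for the estimators. The function names `omnibus_power` and `iu_power_mc` are illustrative only.

```python
import math
import random
from statistics import NormalDist


def omnibus_power(d0, d1, v0, v1, c01, alpha=0.05):
    """Approximate power of a 2-df Wald chi-square test of H0: d0 = d1 = 0.

    d0, d1  : assumed true subgroup-specific treatment effects
    v0, v1  : variances of the two effect estimators
    c01     : covariance between the two estimators
    """
    det = v0 * v1 - c01 ** 2
    # Noncentrality parameter: the quadratic form d' Sigma^{-1} d.
    ncp = (d0 * d0 * v1 - 2.0 * d0 * d1 * c01 + d1 * d1 * v0) / det
    # For a chi-square with 2 df the critical value is -2 * ln(alpha);
    # we work with half of it throughout.
    half_crit = -math.log(alpha)
    # A noncentral chi-square(2) is a Poisson(ncp / 2) mixture of central
    # chi-square(2 + 2j) variables, whose upper tails have a closed form.
    pois = math.exp(-ncp / 2.0)   # Poisson weight for j = 0
    term = math.exp(-half_crit)   # exp(-x/2) * (x/2)^0 / 0!
    tail = term                   # upper tail of chi-square(2) at the critical value
    power = 0.0
    for j in range(500):          # truncation is ample for moderate ncp
        power += pois * tail
        pois *= (ncp / 2.0) / (j + 1)
        term *= half_crit / (j + 1)
        tail += term
    return power


def iu_power_mc(d0, d1, v0, v1, c01, alpha=0.05, draws=100_000, seed=1):
    """Monte Carlo power of an intersection-union test: reject only if BOTH
    subgroup-specific two-sided z-tests reject at level alpha (an assumption
    of this sketch)."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    s0, s1 = math.sqrt(v0), math.sqrt(v1)
    rho = c01 / (s0 * s1)
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z0 = d0 / s0 + a
        z1 = d1 / s1 + rho * a + math.sqrt(1.0 - rho ** 2) * b
        if abs(z0) > z_crit and abs(z1) > z_crit:
            hits += 1
    return hits / draws


# Hypothetical inputs, chosen only for illustration.
print(omnibus_power(0.3, 0.4, 0.010, 0.012, 0.004))
print(iu_power_mc(0.3, 0.4, 0.010, 0.012, 0.004))
```

Because the intersection–union test requires both subgroup-specific tests to reject, its power cannot exceed that of either test alone. In a sample size calculation, one would increase the number of clusters (which shrinks `v0`, `v1`, and `c01`) until the chosen test reaches the target power.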