Cluster Randomised Trials

Most evaluations involve the assessment of interventions that are delivered to units larger than individuals. These might be health system catchment areas, schools, geographic regions or even countries. In some cases it is appropriate and possible to randomly allocate the intervention of interest at the level of these larger units, and thereby apply the most efficient and reliable approach to reducing confounding.

Key Resources For Learning About Cluster Randomised Trials

A new cluster randomised trials website has been developed to support those conducting cluster randomised trials and stepped wedge designs, and those carrying out methodological research on these designs. The website has the latest publications, software, discussions and events related to clustered designs.

This book by Richard Hayes and Lawrence Moulton has become the leading methodological text in this area:

Cluster Randomised Trials
Hayes R, Moulton L. Cluster randomised trials. Chapman and Hall/CRC Press, Boca Raton, FL, 2009

A short summary of what the book offers on each of the following topics is given below:

Rationale and Limitations of Cluster Randomised Trials

There are several circumstances in which cluster randomised trials are appropriate and may be chosen over other randomised designs. Some interventions, such as educational programs or improved water supplies in villages, are by their nature applied to whole communities rather than to individual people. In some circumstances, cluster randomisation is logistically more convenient, or the intervention is more acceptable when delivered to the entire population rather than to individuals. Cluster randomised trials (CRTs) are also an effective way to avoid contamination, and this is one of the most common reasons for adopting the design. Finally, CRTs allow both the direct and indirect effects of an intervention to be captured, providing a measure of the overall effect of implementing an intervention throughout a population. This is particularly useful when considering infectious diseases: those receiving the intervention benefit from both its direct effect on susceptibility to infection and the indirect mass effect of reduced exposure to infection.

When considering a cluster randomised design, these advantages need to be weighed against the limitations. Statistical and cost efficiency are important to consider: the power and precision of a cluster randomised trial are lower than those of an individually randomised trial of the same size, and the logistical aspects of working in several different clusters may render a CRT expensive to implement. Other issues to consider are selection bias, imbalance between study arms, and generalisability. The rationale, along with the limitations and strategies to minimise them, is discussed in Chapter 3.

Design Considerations

Design Choices for Treatment Arms

Parallel Group Design

This is the most common design for both individually and cluster randomised trials. Under this design, each cluster remains in the arm it was randomly allocated to throughout the whole trial.

Three arm trials

Given the expense and logistical complexity associated with CRTs, and the difficulty in enrolling enough clusters to provide an adequate sample size in each treatment arm, the great majority of CRTs follow a study design in which clusters are randomised to only two treatment arms. Three-arm trials are sometimes feasible, but CRTs with more than three arms are very uncommon. Three-arm trials follow two main approaches: the first compares two different interventions with a control arm, and the second compares the same intervention given at varying levels of intensity with a control arm to produce a dose-response analysis.

Factorial trials

Conventionally, estimating the effects of two interventions would require either designing two trials or conducting a three-arm trial, which has the disadvantage of a smaller sample size in each arm. Factorial designs allow the independent effects of two interventions to be studied in the same trial, which is cost efficient and conserves sample size. The design takes a 2 x 2 layout resulting in four treatment arms: one arm receiving the first intervention, another receiving the second intervention, an arm receiving both interventions, and finally a control arm. Although there are four arms, the effect of each intervention is estimated by comparing the two arms that received it, combined, against the two arms that did not. This approach is only valid if there is no interaction between the interventions. Where interactions are expected, or are of interest, factorial designs can be used to identify the joint effect of two interventions, although larger sample sizes may be required.
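The way the four arms are pooled can be illustrated with a minimal Python sketch. The arm labels and cluster-level outcome values below are invented for illustration only, and the calculation assumes there is no interaction between the two interventions.

```python
from statistics import mean

# Hypothetical cluster-level outcomes (e.g. prevalence in each cluster)
# for the four arms of a 2 x 2 factorial CRT. Invented values.
arms = {
    "control": [0.30, 0.28, 0.32],
    "A_only": [0.24, 0.26, 0.22],
    "B_only": [0.27, 0.25, 0.29],
    "A_and_B": [0.20, 0.22, 0.18],
}


def main_effect(arms, with_arms, without_arms):
    """Difference in mean outcome between the pooled arms that received
    an intervention and the pooled arms that did not."""
    with_values = [y for name in with_arms for y in arms[name]]
    without_values = [y for name in without_arms for y in arms[name]]
    return mean(with_values) - mean(without_values)


# Effect of A: (A only + A and B) versus (control + B only)
effect_a = main_effect(arms, ["A_only", "A_and_B"], ["control", "B_only"])
# Effect of B: (B only + A and B) versus (control + A only)
effect_b = main_effect(arms, ["B_only", "A_and_B"], ["control", "A_only"])

print(f"Estimated effect of A: {effect_a:.3f}")
print(f"Estimated effect of B: {effect_b:.3f}")
```

Each comparison uses all of the clusters in the trial, which is why the factorial layout conserves sample size compared with running separate two-arm comparisons.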

Cross Over Design

The aim of this design is to control for time trend. This design is commonly used in individually randomised trials, and has been adopted for CRTs. Each cluster receives two treatments, one after the other. There is often a period in between called the washout period to avoid any carry-over effects.

Stepped Wedge Design

Click here to learn more about this design.

Type and Size of Clusters

One of the first decisions to be taken when designing a CRT relates to the choice and definition of the clusters that are to be randomised during the trial. There is a wide variety of types and sizes of clusters ranging from families or households with a few individuals, to large geographic areas containing millions of individuals. The practical elements of implementing such trials are very different. Chapter 4 considers the different types of study cluster and discusses the key issues to be considered in choosing cluster size.


Contamination

Contamination occurs when responses in one cluster are distorted because of contact with individuals from outside the cluster. Although cluster randomisation reduces it, contamination may still occur and pose an important problem in CRTs. It could happen due to contact between the intervention clusters and the control clusters, or due to contact between either set of clusters and the wider population. Strategies to reduce the degree of contamination in a CRT include selecting clusters that are sufficiently distant and well separated from each other. Where geographic zones, rather than specific communities, are assigned to the intervention or control arms, buffer zones are used so that clusters do not share a common boundary. These two strategies are used to limit contamination between the intervention and control clusters. The ‘fried egg design’ is a strategy used to reduce contact between the intervention or control clusters and the wider population. The ways in which contamination occurs and the strategies to reduce it are discussed further in Chapter 4.

Approaches to measuring outcomes from individuals

The outcomes of interest are measured in a sample of individuals selected from each cluster. There are two main approaches to measuring individuals, depending on the outcome: cross-sectional surveys or cohorts. A full discussion of when each may be used, and of their advantages and disadvantages, is found in Chapter 8.

Repeated cross sectional Samples

Repeated cross-sectional surveys draw a new sample from each cluster at different times. This approach is used when the outcome measure is the prevalence of a binary outcome (such as HIV infection or smoking) or the mean of a quantitative endpoint (such as cholesterol level or the height of children).

Cohort Follow Up

The cohort approach involves following up selected individuals over time. This is used when the outcome measure is a rate or risk of events occurring during a specified follow-up period. The cohort can consist of the total population of a cluster or a random sample from that cluster. When the total population is to be followed up, it must be specified whether people entering the population at a later date will be included or whether the study will be limited to those seen at baseline.

Sample size

When designing a CRT, sample size is one of the most important factors to consider. An inadequate sample size increases random error and reduces the power of the study, and thus reduces the ability to quantify the effect accurately. Chapter 7 sets out in detail the methods needed to select an appropriate sample size for a CRT. This includes methods for unmatched, matched, and stratified study designs, as well as methods to select an appropriate sample size for each cluster.
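As a rough illustration of how between-cluster variability enters a sample size calculation, the sketch below implements a widely used formula for an unmatched comparison of two proportions, in which the number of clusters per arm depends on the cluster size m and the coefficient of variation between clusters k. The formula is not quoted from this summary, and the function name, defaults and input values are assumptions chosen for illustration; Chapter 7 of the book should be consulted for the methods appropriate to a given design.

```python
import math
from statistics import NormalDist


def clusters_per_arm(p0, p1, m, k, alpha=0.05, power=0.80):
    """Approximate number of clusters per arm for an unmatched CRT
    comparing proportions p0 (control) and p1 (intervention).

    m is the number of individuals sampled per cluster and k is the
    coefficient of variation of true proportions between clusters.
    Illustrative sketch only, not a validated calculator.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    within = (p0 * (1 - p0) + p1 * (1 - p1)) / m   # binomial sampling variation
    between = k ** 2 * (p0 ** 2 + p1 ** 2)         # between-cluster variation
    c = 1 + (z_alpha + z_beta) ** 2 * (within + between) / (p0 - p1) ** 2
    return math.ceil(c)


# Hypothetical inputs: prevalence falling from 10% to 5%,
# 200 individuals sampled per cluster.
for k in (0.0, 0.25, 0.5):
    print(f"k = {k}: {clusters_per_arm(0.10, 0.05, m=200, k=k)} clusters per arm")
```

Increasing m shrinks only the within-cluster term, so once between-cluster variation dominates, adding individuals to each cluster gains little and more clusters are needed instead.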

Features Requiring Special Methods of Design and Analysis

Intra-cluster Correlation and Between-cluster Variability

In individually randomised trials, individuals are assumed to provide statistically independent observations of the outcome of interest. This assumption does not hold in CRTs, as observations on individuals within the same cluster tend to be correlated. This means that knowledge of one individual’s outcome will tend to provide information about the outcome of another individual in the same cluster. Intra-cluster correlation occurs in CRTs for three main reasons:

Clustering of population characteristics

Variation exists between different populations due to differences in the individuals that make up each cluster, such as demographic or socioeconomic characteristics, or due to differences in cluster-level variables, such as environmental characteristics of the cluster.

Variations in response to Intervention

Different clusters may respond differently to the intervention, which results in variation in outcomes between clusters even if no such variation was present before the intervention.

Correlation Due to Interaction between individuals

Cluster randomisation may be particularly important in trials of interventions where one individual in a cluster may have a direct or indirect effect on the outcome in other individuals. Examples include interventions against infectious diseases, or health education programs where educational messages are discussed by members of the community, leading to similarities in behaviour.

The extent of the intra-cluster correlation depends on the nature and size of the clusters. The concept also depends on the existence of other clusters: it has no meaning if there is just one study population, in one cluster, under consideration. Additionally, it only exists if there is true variability in the outcomes between clusters. Intra-cluster correlation and between-cluster variability can therefore be thought of as corresponding concepts that provide two different perspectives on the same underlying phenomenon. The inferences that can be made from a CRT depend on the degree of between-cluster variability in the outcome of interest, so this variability should be measured appropriately and considered in both the design and the analysis of a CRT. There are two measures by which between-cluster variability can be summarised: the coefficient of variation between clusters, and the intra-cluster correlation coefficient. These are discussed thoroughly in Chapter 2 of the book.
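The link between the two summary measures can be illustrated with a short Python sketch. It uses two standard results that are not stated in this summary: for a binary outcome with overall proportion p, the intra-cluster correlation coefficient equals k² p / (1 - p), where k is the coefficient of variation between clusters; and with m individuals per cluster, the variance of an estimate is inflated by the design effect 1 + (m - 1) × ICC. The function names and input values are illustrative assumptions.

```python
def icc_from_k(k, p):
    """Intra-cluster correlation coefficient implied by a between-cluster
    coefficient of variation k, for a binary outcome with proportion p."""
    return k ** 2 * p / (1 - p)


def design_effect(m, icc):
    """Variance inflation from cluster sampling with m individuals
    per cluster and intra-cluster correlation coefficient icc."""
    return 1 + (m - 1) * icc


# Hypothetical values: 10% prevalence, k = 0.25, 200 individuals per cluster.
icc = icc_from_k(k=0.25, p=0.10)
deff = design_effect(m=200, icc=icc)
effective_n = 200 / deff  # independent observations each cluster is "worth"

print(f"ICC = {icc:.4f}, design effect = {deff:.2f}, "
      f"effective sample size per cluster = {effective_n:.0f}")
```

Even a small intra-cluster correlation coefficient can produce a large design effect when clusters are big, which is why both quantities need to be considered together.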

Study Arm Imbalances

Due to practical and financial constraints, the number of clusters randomised in a CRT is often quite small compared to the number of individuals typically recruited to an individually randomised trial. With a small number of clusters, simple randomisation does not ensure that the two arms are balanced, so there is a risk of imbalance between study arms on one or more potential confounding factors. Design strategies such as matching and stratification can be used to improve the balance between treatment arms and to reduce between-cluster variability. These are discussed in Chapter 5, which also provides guidelines on when these strategies should be used.

Matching can help minimise the differences between treatment arms with respect to baseline characteristics, and can improve the power and precision of the study. If there is substantial between-cluster variability, it may be decided to first group together clusters that are expected to be similar with respect to the outcome of interest, and to allocate the treatment within these groups. Grouping the clusters into similar pairs ensures that the treatment arms are similar at baseline, at least with respect to the characteristics we choose to match on.

Stratification involves grouping of available clusters into two or more strata that are expected to be similar with respect to the outcome of interest. The clusters within each stratum are then randomly allocated between the treatment arms. Stratification has several advantages over the matched design.

Matched and stratified designs are examples of restricted randomisation since these schemes involve selecting randomly from a smaller set of allocations fulfilling certain restrictions.
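A stratified allocation of this kind can be sketched in a few lines of Python. The stratum names, cluster identifiers and seed are hypothetical, and the rule for odd-sized strata (the extra cluster goes to control) is an arbitrary choice made for this sketch, not a recommendation from the book.

```python
import random


def stratified_allocation(strata, seed=None):
    """Randomly allocate clusters to two arms separately within each stratum.

    strata maps a stratum name to a list of cluster identifiers.
    Returns a dict mapping each cluster to "intervention" or "control".
    """
    rng = random.Random(seed)
    allocation = {}
    for clusters in strata.values():
        shuffled = list(clusters)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2  # with an odd count, the extra goes to control
        for i, cluster in enumerate(shuffled):
            allocation[cluster] = "intervention" if i < half else "control"
    return allocation


# Hypothetical strata expected to be similar with respect to the outcome.
strata = {
    "urban": ["u1", "u2", "u3", "u4"],
    "rural": ["r1", "r2", "r3", "r4", "r5", "r6"],
}
allocation = stratified_allocation(strata, seed=2024)
for cluster, arm in sorted(allocation.items()):
    print(cluster, arm)
```

A pair-matched design is the special case in which every stratum contains exactly two clusters, one of which is allocated to each arm.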

While these designs may help reduce imbalances between the treatment arms, there are circumstances where they cannot be relied upon to achieve adequate balance, particularly when there are several variables on which balance is required. In such circumstances, another approach to restricted randomisation that achieves overall balance between the treatment arms can be employed. Overall balance means that each of the variables is similarly distributed across treatment arms; it does not require balance within subgroups. This is done by using baseline or pre-existing data on each cluster and restricting the randomisation to allocations that satisfy certain pre-determined balance criteria. Chapter 6 explains this approach to restricted randomisation and describes the types of variables on which balance would be required, how to define the balance criteria that would restrict allocations, and the circumstances under which re-enumeration of allocations should be considered. When a restricted randomisation scheme is used, there is a risk of producing a design that is biased or not valid, which results in standard methods of statistical inference giving incorrect results. Chapter 6 also explains what is meant by bias and validity, when these problems might occur, and how to account for them.
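The idea of restricting to acceptable allocations can be sketched as follows. The baseline values, the single balance criterion (a maximum difference in arm means) and the threshold are all hypothetical; a real trial would typically balance on several variables and would check the validity of the scheme as described in Chapter 6.

```python
import random
from itertools import combinations
from statistics import mean


def acceptable_allocations(covariate, max_diff):
    """Enumerate every equal split of clusters into two arms and keep those
    whose arm means of the baseline covariate differ by at most max_diff.

    Returns a list of tuples, each holding the intervention-arm clusters.
    """
    clusters = sorted(covariate)
    n_arm = len(clusters) // 2
    acceptable = []
    for intervention in combinations(clusters, n_arm):
        control = [c for c in clusters if c not in intervention]
        diff = abs(mean(covariate[c] for c in intervention)
                   - mean(covariate[c] for c in control))
        if diff <= max_diff:
            acceptable.append(intervention)
    return acceptable


# Hypothetical baseline prevalence in six clusters.
baseline = {"c1": 0.10, "c2": 0.12, "c3": 0.15,
            "c4": 0.20, "c5": 0.22, "c6": 0.25}

acceptable = acceptable_allocations(baseline, max_diff=0.03)
rng = random.Random(7)            # fixed seed so the example is reproducible
chosen = rng.choice(acceptable)   # final allocation drawn from the restricted set

print(f"{len(acceptable)} acceptable allocations; intervention arm: {chosen}")
```

Counting how many allocations survive the restriction is a simple first check: if very few remain, the scheme may be so constrained that the concerns about bias and validity raised above become relevant.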


Analysis

There are two main approaches to the analysis of CRTs: analysis based on cluster-level summary measures, and analysis based on individual-level data using regression methods that allow for intra-cluster correlation.

The primary principle of both these methods is that they take into account the two key features of CRTs discussed earlier: intra-cluster correlation, and chance imbalances between study arms resulting from a small number of clusters.

The book does not detail all of the possible methods that can be used for the analysis of CRTs. Instead, Chapters 9 to 12 focus on those that have been shown to be efficient and robust.

The analysis method should be appropriate for the specific design.
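The first of the two approaches above, analysis based on cluster-level summaries, can be sketched in Python for an unmatched two-arm trial. Each cluster contributes one summary value, and the arms are compared with an unpaired t statistic on those values. The data are invented, and the choice of a pooled-variance t statistic is an assumption made for this sketch; a matched or stratified design would call for a different analysis.

```python
import math
from statistics import mean, variance


def cluster_level_t(intervention, control):
    """Unpaired, pooled-variance t statistic computed on cluster summaries.

    Each list holds one summary value (e.g. a prevalence) per cluster.
    Returns (difference in means, t statistic, degrees of freedom).
    """
    n1, n0 = len(intervention), len(control)
    diff = mean(intervention) - mean(control)
    pooled_var = ((n1 - 1) * variance(intervention)
                  + (n0 - 1) * variance(control)) / (n1 + n0 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n0))
    return diff, diff / se, n1 + n0 - 2


# Hypothetical cluster-level prevalences, four clusters per arm.
intervention = [0.20, 0.22, 0.18, 0.24]
control = [0.30, 0.28, 0.32, 0.26]

diff, t_stat, df = cluster_level_t(intervention, control)
print(f"difference = {diff:.3f}, t = {t_stat:.2f} on {df} degrees of freedom")
```

Because the cluster, not the individual, is the unit of analysis, the degrees of freedom depend on the number of clusters, and the t statistic would be compared against a t distribution with that many degrees of freedom.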

Reporting and Interpretation

There is a growing body of evidence and experience of cluster randomised trials for assessing the impact of interventions on health outcomes, and the extended CONSORT guidelines are available to guide the reporting of such trials:

Consort 2010 statement: extension to cluster randomised trials.
Campbell MK, Piaggio G, Elbourne DR, Altman DG. Consort 2010 statement: extension to cluster randomised trials.

Chapter 15 of the Cluster Randomised Trials book by Hayes and Moulton discusses and explains the CONSORT guidelines.