The hamlet leader accompanies us, as Tatu later explains to me, because he knows each household in the hamlet and where the boundaries of nearby hamlets lie, but also because he speaks the local vernacular, a variant of Swahili that she has been attempting to learn so that she will not have to rely on his interpretation for every answer.
As soon as this census is finalized, the trial will enter an operational limbo: distribution of bednets cannot commence until a decision has been made on which intervention group each household belongs to. In theory, the virtue of a Randomized Controlled Trial (RCT) is that those decisions are left to a computer; in practice, the reality is much more complicated.
One of the first assumptions we must contend with in statistics is that all variables gathered in a sample are independent; that is to say, no sampled variable is related to another. This, of course, is impossible. Variables are highly interlinked: malaria is closely tied to a person’s socioeconomic status, which is closely tied to the type of house they live in, to what work they do, and to where they live, which is again tied to malaria. The people in one area are also more likely to have similar values for these variables than those in another. One thing we can attempt to do in a trial of this scope is to control for these dependencies.
The first step we take in this endeavor is to aggregate households into groups or “clusters.” Ideally, the households within a cluster would all differ from the rest of the study in exactly the same way; in practice, we aim for them to be more similar to the others in their cluster than to those outside it. By doing so we may weight these households by an intra-cluster correlation coefficient (ICC), which accounts for some of the disparate effects that an intervention may have on a community because of the similarity of households within a cluster and their difference from those in another. This process yields a Cluster-Randomized Controlled Trial (CRCT), in which we can allocate interventions by cluster rather than by individual.
CRCTs are the standard in malaria trials for a number of reasons. The first is the difficulty of randomizing at the household level: the logistics of delivering a blinded intervention to each of 40,000 locations, ensuring that households do not swap nets with their neighbors, and linking effects from individual households back to the intervention they were given. The second is that there is a community effect in bednet trials: a grouping of households sharing the same intervention produces a greater antimalarial effect than a single household with a bednet. To capture this effect accurately, as it would occur in a large-scale bednet rollout, the same intervention must be given to the entire community.
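To make the role of the ICC concrete, the usual textbook definitions can be written out as below. This is a sketch of the standard formulation rather than the trial's own calculation: the symbols (between-cluster variance \sigma_b^2, within-cluster variance \sigma_w^2, average cluster size m, total sample size n) and the numbers in the worked line are illustrative assumptions, not values taken from the study.

```latex
% ICC: the share of total variance that lies between clusters
\rho = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}

% Design effect: how much clustering inflates the variance
% relative to randomizing the same number of individuals
\mathrm{DEFF} = 1 + (m - 1)\,\rho

% Effective sample size once clustering is accounted for
n_{\mathrm{eff}} = \frac{n}{\mathrm{DEFF}}

% Illustrative values only: m = 100, \rho = 0.05
\mathrm{DEFF} = 1 + (100 - 1)(0.05) = 5.95
```

Even a small ICC matters when clusters are large: under these assumed values, each household contributes roughly one sixth of the information it would in an individually randomized trial.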
In creating a cluster map of the Misungwi region we begin with the minimum unit of aggregation. This study considers people (particularly children), who live in households, which are organized into hamlets, which are part of villages, which are grouped into wards. Each of these constitutes a unit of aggregation. In some cases we can dismiss levels without much thought: all the people within a household often sleep in the same place, so distributing two or more types of bednet and ensuring that the right people sleep under the right net would be unmanageable. Similarly, there are only a handful of wards in Misungwi, so dividing by ward would make it impossible to have balanced intervention groups. Ideally, we use the smallest unit of aggregation possible, as this allows for the most nuance in the creation of clusters. In this case the decision was made for us: the trial requires the assistance of hamlet leaders to distribute bednets, so no cluster may cross a hamlet boundary.
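The idea of allocating by cluster rather than by household can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the trial's actual procedure: it treats each hamlet as exactly one cluster, and the household records, the arm labels, the seed, and the function name `allocate_by_cluster` are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical census rows: (household_id, hamlet). All values are invented.
households = [
    ("hh-001", "hamlet-A"),
    ("hh-002", "hamlet-A"),
    ("hh-003", "hamlet-B"),
    ("hh-004", "hamlet-B"),
    ("hh-005", "hamlet-C"),
    ("hh-006", "hamlet-D"),
]


def allocate_by_cluster(households, arms, seed):
    """Give every household the arm drawn for its hamlet (assumed = cluster)."""
    # Group households by hamlet so that no cluster crosses a hamlet boundary.
    clusters = defaultdict(list)
    for household_id, hamlet in households:
        clusters[hamlet].append(household_id)

    # Shuffle the hamlets with a seeded generator so the draw is reproducible.
    rng = random.Random(seed)
    hamlets = sorted(clusters)
    rng.shuffle(hamlets)

    # Deal the shuffled hamlets to the arms in turn, so the number of
    # clusters per arm differs by at most one.
    hamlet_arm = {h: arms[i % len(arms)] for i, h in enumerate(hamlets)}

    return {
        household_id: hamlet_arm[hamlet]
        for hamlet, members in clusters.items()
        for household_id in members
    }


allocation = allocate_by_cluster(households, arms=["net-1", "net-2"], seed=2024)
for household_id, arm in sorted(allocation.items()):
    print(household_id, arm)
```

Because the draw happens once per hamlet, neighbors always receive the same intervention. A real allocation would typically also balance arms on covariates such as cluster size or baseline prevalence, which this sketch leaves out.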