Academics have frequently referred to a 'replication crisis' precisely because the results of so many studies have proved difficult to reproduce. In education, trials are often carried out by a mix of teachers and researchers.
Larger studies, in particular, create ample opportunities for inadvertent fidelity losses, whether through human factors (such as research instructions being misread) or through changes in the research environment (for example, to the timing or conditions of a test). Ellefson and Professor Daniel Oppenheimer from Carnegie Mellon University developed a computer simulation of a randomised controlled trial which, in the first instance, modelled an imaginary intervention in 40 classrooms, each with 25 students.
They ran this over and over again, each time adjusting a set of variables: the potential effect size of the intervention, the ability levels of the students, and the fidelity of the trial itself. In subsequent models, they added further confounding elements that might affect the results, such as the quality of resources in the school, or the fact that better teachers might have higher-performing students. The study combined representative permutations of the variables they introduced, modelling 11,055 trials altogether.
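A minimal sketch of this kind of simulation might look like the following Python. The parameter names, the simple linear fidelity model and the distributional choices are assumptions made here for illustration, not the authors' actual code, which varied many more factors.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_trial(n_classrooms=40, n_students=25,
                   true_effect=0.5, fidelity=1.0):
    """Simulate one randomised controlled trial of an imaginary
    intervention. Half the classrooms receive the intervention, whose
    delivered effect is attenuated by the trial's fidelity.
    (Illustrative model only.)"""
    # Baseline ability varies by classroom and by student.
    classroom_means = rng.normal(0.0, 0.3, n_classrooms)
    scores = rng.normal(classroom_means[:, None], 1.0,
                        (n_classrooms, n_students))

    # Randomly assign half the classrooms to the intervention.
    treated = rng.permutation(n_classrooms) < n_classrooms // 2

    # Fidelity losses shrink the effect actually delivered.
    scores[treated] += true_effect * fidelity

    # Observed effect size: standardised mean difference (Cohen's d).
    diff = scores[treated].mean() - scores[~treated].mean()
    return diff / scores.std()

# Re-run the trial across a grid of fidelity levels, many times each,
# mirroring the study's repeated permutations of its variables.
for fidelity in (1.0, 0.9, 0.8, 0.5):
    effects = [simulate_trial(fidelity=fidelity) for _ in range(500)]
    print(f"fidelity {fidelity:.0%}: mean observed effect "
          f"{np.mean(effects):.3f}")
```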
Strikingly, across the entire data set, the results indicated that for every 1 per cent of fidelity lost in a trial, the effect size of the intervention also dropped by 1 per cent. This 1:1 correspondence meant that even a trial with, say, 80 per cent fidelity would see its measured effect size fall by a fifth, which might cast doubt on the value of the intervention being tested. A more granular analysis then revealed that the effect of fidelity losses tended to be greater where a bigger effect size was anticipated.
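Under that 1:1 relationship, the attenuation is simple to work through. The snippet below illustrates it; the function and the true effect size of 0.50 are hypothetical, chosen only to make the arithmetic concrete.

```python
# Under the reported 1:1 relationship, each per cent of fidelity lost
# knocks a per cent off the observed effect size (illustrative numbers).
def observed_effect(true_effect, fidelity):
    return true_effect * fidelity

for fidelity in (1.0, 0.95, 0.80):
    print(f"fidelity {fidelity:.0%}: "
          f"effect {observed_effect(0.50, fidelity):.2f}")
# fidelity 100%: effect 0.50
# fidelity 95%: effect 0.47
# fidelity 80%: effect 0.40
```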
In other words, the most promising research innovations were also the most sensitive to fidelity violations. Although the confounding factors weakened this overall relationship, fidelity had by far the greatest impact on effect sizes in all the tests the researchers ran. Ellefson and Oppenheimer suggested that organisations conducting research trials might wish to establish firmer processes for ensuring, measuring and reporting fidelity, so that their recommendations are as robust as possible.
Their paper pointed to research from 2013 which found that only 29 per cent of after-school intervention studies measured fidelity, and to another study, from 2010, which found that only 15 per cent of social work intervention studies collected fidelity data.