A recent presentation by Dr David Ogilvie on Physical Activity for Public Health: In pursuit of rigorous evaluation in the real world highlighted an issue close to my heart – the challenges of evaluating public health interventions.
Dr Ogilvie is visiting from the University of Cambridge as “thinker in residence” with the Prevention Research Collaboration at the University of Sydney. His presentation described two of several very sophisticated evaluations of public policies and interventions designed to prioritise walking, cycling and the use of public transport. It highlighted the difficulties of balancing scientific rigour with the practicalities of implementing complex interventions (such as the construction and operation of a guided bus route and complementary walking and cycling routes).
As a panellist at the seminar, which was organised by The Australian Prevention Partnership Centre, the Sax Institute and the Prevention Research Collaboration, I reflected on my own experiences 30 years ago when I was grappling with the challenges of evaluating a complex, multi-level intervention to tackle heart disease in Wales in the UK – the Heartbeat Wales program.
An evaluation challenge
We had established a multi-level evaluation design to monitor change in knowledge, social norms, and behavioural and clinical risk in individuals, as well as change in a set of environmental measures ranging from human and financial resource commitment to media activity, and change related to specific interventions we conducted with schools, supermarkets and other sites. We collected data over five years in Wales and in a matched region in the North East of England with a comparable population and heart disease profile.
After five years of intervention, progress in Wales was generally very positive in relation to targeted changes. Sadly, from an experimental (but not a public health) perspective, equally impressive changes could be observed in our reference community.
Investigation revealed that a delegation from the North East had visited Wales two years into our intervention and had observed, then rapidly adopted, many of the innovations we were testing in Wales. Ultimately they invested more in their interventions than we were able to do in Wales, and without the burdensome evaluation that we were obliged to undertake.
Was the Heartbeat Wales program a success? Dr Ogilvie’s presentation prompted me to reflect on the insidious nature of the prevailing “gold standard” of evaluation research ‒ randomised controlled trials (RCTs). As a scientific study, Heartbeat Wales was undoubtedly compromised. At best it was a flawed quasi-experimental design, and the intervention was hopelessly “contaminated” by the pesky interventions of public health activists in the North of England. It would be dismissed in contemporary systematic reviews as evidence of no scientific value. Yet this is patently not the case.
Regrettably, large-scale experiments epitomised by Heartbeat Wales simply don’t happen anymore, and the type of work that Dr Ogilvie champions is difficult to get funding for, technically challenging to undertake and correspondingly unattractive to many researchers who are under pressure to win grants and get publications.
One unintended consequence is that many research questions of great public health significance that do not lend themselves to RCTs remain unanswered. We have become masters of learning more and more about less and less ‒ providing ever more sophisticated answers to the wrong questions.
There can be no doubt that RCTs offer widely valued evidence of causality in interventions, and have potential application in a wide variety of circumstances, particularly interventions directed at single issues. But improving public health is a complex process. A comprehensive program might consist of multiple interventions working synergistically to achieve several outcomes ‒ think of all the different fiscal, regulatory, health service and public education interventions that were required to reduce smoking rates in the population. Reducing an integrated set of interventions to their component parts for the purposes of evaluation almost invariably results in an irretrievable loss of the “whole” ‒ providing good answers to the wrong question.
Evaluation designs have to be tailored to suit the nature of the intervention and the local context. Because of their multifaceted nature and dependence on context, most public health interventions require adaptations during implementation. Our evaluation designs and methods need to accommodate this.
There is some progress in achieving wider recognition of the strengths and weaknesses of trial methodologies in the evaluation of complex health promotion interventions. But this is not always the experience of those who apply for research grants or look to get papers published in journals, and is not obvious in published criteria for assessing the “quality” of evidence from evaluations of public health interventions.
The simple fact remains that all evaluation designs have strengths and weaknesses, but the best evaluation designs combine different research methodologies, both quantitative and qualitative, to produce a diverse range of data that will optimally support both the integrity of the study and its external usefulness. Greater recognition of the place of natural experiments, and more use of outcome measures designed to assess success in changing social, economic and environmental determinants of health, are needed.
The judgment of scientific merit needs to be based on scientific rigour, and not confused with methodology. Using qualitative research methods to understand context and implementation is as important as observing outcomes, and essential to understanding causality and generalisability.
We need to work with policy makers to understand more clearly the type of questions that need answering, and to continue to develop the research methods that deliver the best possible answers to questions of greatest public health importance. The Sax Institute has a unique and vital role in this dialogue.
Alongside this we have to argue for funding of more “real world” complex and technically challenging intervention evaluation. The current review of NHMRC funding programs offers a near-term opportunity to do this.
Professor Nutbeam is a Senior Adviser at the Sax Institute and a Professor of Public Health at the University of Sydney. He is co-author, with Professor Adrian Bauman, of the book Evaluation in a Nutshell: A practical guide to the evaluation of health promotion programs.