Tips and tricks for Evidence Synthesis: Session 2

Reviewing complex qualitative data: new approaches

August 29, 2021

The second of the four-part seminar series exploring evidence synthesis took place on the 30th of July. We were pleased to welcome Dr. Salla Atkins, Associate Professor at Tampere University (Finland) and the Karolinska Institutet (Sweden), and Prof. Chris Bonell, Professor of Public Health and Sociology at the London School of Hygiene and Tropical Medicine.

Part 1: Dr. Salla Atkins - Getting meaningful results from sampling in qualitative evidence synthesis

The problem: an ever-increasing number of eligible papers

This talk looked at the challenge posed by the growing volume of research available for qualitative evidence synthesis and at how sampling can be used to manage it. Dr Atkins started by highlighting why we do qualitative research syntheses. They allow us to pull together qualitative evidence to inform practice and future interventions about a particular issue or to generate new theories about how interventions work. One of the challenges of doing qualitative reviews is the ever-expanding number of potential papers to include, which are often of varying quality. Sampling is one way to approach this, which Dr Atkins explored through two cases.

To put us in context, the first example was a study published in 2007 by Munro et al. The main aim of this study was to understand the factors considered important by patients, caregivers and healthcare providers in contributing to tuberculosis (TB) medication adherence. 19 databases from 1966-2005 were searched, producing 7814 records, of which 44 were included and synthesised. Dr Atkins highlighted that a simple search was constructed using the terms “tuberculosis” and “adherence or concordance”.

By contrast, the second and more recent example is an ongoing review by Dr Atkins and colleagues (whose protocol is published here). The main aim of this review is to explore how conditional and unconditional cash transfers aimed at impacting health behaviours are experienced and perceived by recipients. Databases were searched from 1990 onwards. 7292 citations were assessed for eligibility, and a systematic process led to the inclusion of 128 papers in total.

For this study, the institution’s librarian and the Cochrane group were involved in the development of the search strategy, meaning that the search was very specific and well defined. In other words, a broad search retrieved 7814 records for the 2007 review, whereas a highly specific search retrieved 7292 citations for the 2021 review, highlighting the exponential growth in publications over this 15-year period!

Sampling strategies: what’s the big deal?

Dr Atkins argued that this exponential growth underlines the need for good sampling strategies to produce meaningful and beneficial qualitative reviews:

  • 128 papers are too many for a good analysis
  • Even though software like ATLAS.ti can help with the analysis, more filtering is needed

For qualitative reviews, the aim is to explore variations in concepts rather than produce homogeneous results. Knowing that a need exists for sampling in qualitative synthesis, key questions about how to approach sampling were asked. If we use a sample of eligible papers, can this exclude potentially valuable data? Would selection based on quality criteria risk excluding important contributions from papers that may be of lesser quality?

With these questions in mind, a sampling strategy needs to be chosen. Salla explained that the options are similar to those used when conducting primary qualitative studies. Examples are:

  1. Extreme or deviant case sampling
  2. Intensity sampling
  3. Maximum variation (heterogeneity)
  4. Homogeneity
  5. Typical case
  6. Purposeful random sampling
  7. Convenience sampling

What did they do?

For this 2021 review, the authors wanted to highlight that different groups of interventions (i.e., different types of cash transfer approaches) function in different ways. For example, approaches based on economic “nudge” theory aim to modify a particular behaviour, whereas unconditional cash grants follow more of a human rights approach.

The study group aimed for maximum variation in terms of geographic spread, type of grant used and health conditions. Each eligible study was coded by health condition: HIV, TB, disability, reproductive and maternal health, and mental health. Part of this narrowing-down process also involved careful consideration of what to exclude. In this case, studies exploring “wellbeing and nutrition” were excluded as they did not fit with the other conditions. A difficult but important decision-making process!

Of the 128 eligible papers, 43 were included. They covered interventions using conditional cash transfers, unconditional cash transfers, and a mixture of both. The authors also found that using this sampling strategy meant losing data on some countries, but reduced the overrepresentation of others, in this case the United Kingdom and South Africa.
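To make the idea of maximum variation sampling more concrete, here is a minimal sketch in Python. It is not the procedure used in the review: the greedy selection rule, the `Study` fields and the example studies are all assumptions made for illustration. The sketch repeatedly picks the study that adds the most country, grant type or health condition values not yet represented in the sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    # Hypothetical coding frame; a real review would define its own codes.
    study_id: str
    country: str
    grant_type: str        # e.g. "conditional", "unconditional", "mixed"
    health_condition: str  # e.g. "HIV", "TB", "disability"


def features(study):
    """Return the coded attributes of a study as (name, value) pairs."""
    return {
        ("country", study.country),
        ("grant_type", study.grant_type),
        ("health_condition", study.health_condition),
    }


def max_variation_sample(studies, target_size):
    """Greedily select studies that add the most not-yet-covered values."""
    remaining = list(studies)
    selected, covered = [], set()
    while remaining and len(selected) < target_size:
        # Ties are resolved in favour of the study listed first.
        best = max(remaining, key=lambda s: len(features(s) - covered))
        selected.append(best)
        remaining.remove(best)
        covered |= features(best)
    return selected


# Invented example data, for illustration only.
studies = [
    Study("A", "UK", "unconditional", "disability"),
    Study("B", "UK", "unconditional", "disability"),
    Study("C", "South Africa", "conditional", "HIV"),
    Study("D", "Brazil", "mixed", "TB"),
]
sample = max_variation_sample(studies, 3)
print([s.study_id for s in sample])  # ['A', 'C', 'D']
```

In this toy example, study B is left out because it repeats the profile of study A, which mirrors how sampling for variation can reduce the overrepresentation of heavily studied settings. A rule like this can only suggest candidates; the final choice of papers still rests on the judgement of the review team.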

Reflections about the sampling strategy used

Dr Atkins reflected on the effects of sampling in her example.

  1. Variation in the concepts: The researchers wanted an equal variety of issues covered. Sampling meant that certain issues were overrepresented, such as cash transfers for disability in the UK, despite a large number of studies produced on the latter.
  2. Geographical representation gaps were generated. They noted that the whole Eastern Mediterranean region and several countries were excluded as a result of their sampling strategy.
  3. Sampling by quality of papers means potentially missing out valuable contributions by authors of low research capacity.
  4. Superficial results; is this a risk that comes with the inclusion of many geographical regions, even if the number of countries was reduced?

What about sampling papers based on quality assessment?

The importance of assessing quality, Dr Atkins explained in response to a question from an attendee, comes from the fact that richness of description helps to build the interpretations used in qualitative evidence synthesis. Her experience suggests that poor quality studies contribute less to the body of evidence in an evidence synthesis than studies of higher quality. Quality assessment is, however, a contested issue in qualitative evidence synthesis. Although checklists such as GRADE-CERQual are used to assess quality, they still require a subjective assessment to be made by the researcher. Consequently, Dr Atkins argued that we need to examine further what quality assessment really means for the outcome of a review, and that more research is required into the best approaches for sampling studies for inclusion in qualitative evidence syntheses from among those eligible.

Part 2: Prof. Chris Bonell - Diagrammatic models for synthesising intervention theories of change, applying a meta-ethnographic approach

Professor Bonell presented this interesting and newly forming approach to synthesising evidence of theories of change. A theory of change is a description of why or how an intervention works (or is effective). It explains how change is thought to occur in the short, medium, and long term to achieve an intended (or unintended) impact. Chris began by explaining how systematic reviews traditionally synthesise impact outcomes, and sometimes process evaluation evidence, and that theories of change can also be synthesised.

Extracting information from theories of change in publications means having to explore descriptions with varying degrees of detail. Multiple descriptions of theories of change can be used to get an overarching idea of how a particular category of interventions works. This can then act as a template to develop one’s own intervention as well as inform further analysis e.g. of the mediators and moderators of an intervention.

The example shared by Prof. Bonell was a review of eHealth interventions (Meiksin et al 2021) to improve health outcomes for men who have sex with men (MSM).

A synthesis approach based on meta-ethnography

Meta-ethnographic methods were used to examine existing theories of change and provide a clear and systematic picture of how these interventions might work.

The theories of change for the interventions included in the synthesis were presented mainly as logic models or text, which in this case were specific and clear. Therefore, Chris and his colleagues decided to compare and synthesise information using a visual approach and by considering the different components of each theory. He highlighted that many of the studies included were recent, and therefore the use of logic models was more widespread, which made the theories easier to analyse. Older descriptions of theories of change that do not include logic models may be more challenging to synthesise.

Three synthesis approaches based on meta-ethnography were used for this review:

  1. Reciprocal translation:
    • Similar concepts across different accounts of theories of change in different reports are brought together as an overarching concept
  2. Refutational synthesis:
    • Contradictions or opposing concepts occurring across theories of change are identified
  3. Line of argument:
    • An approach that aims to “piece together” concepts across studies

They produced a diagram for each set of eHealth interventions that fell under a particular behavioural theory and then combined these to form an overarching theory of change in one diagram. The results were then displayed in a delightful set of handwritten penciled diagrams (image below), including activities, mediators, moderators and outcomes. This was an interesting departure from the hard straight lines of most published reports!

Handwritten penciled diagrams for each set of eHealth interventions including activities, mediators, moderators and outcomes

What they found

The studies were largely informed by different cognitive and behavioural theories, such as the information-motivation-behavioural skills model and social cognitive theory, which made them easier to categorise.

Sometimes, creative ways to conceptualise differences in theories across studies had to be used to synthesise the data. Chris showed how the theories of change overlapped but also differed, for instance in the direction of causal flow regarding how the intervention produces change. For example, the eHealth intervention ‘MyDex’ (Bauermeister et al 2017) theorised that participants’ motivation leads to intention, while the intervention ‘Keep It Up!’ (Mustanski et al 2018) suggested the opposite, i.e. that intention leads to motivation. The authors managed this in their report by treating “motivation/intention” as a joint construct in a diagram, and used text to clarify how conclusions differed for this construct. Overall, going through this process for all eligible studies produced an overarching theory of change for eHealth interventions to change health outcomes for MSM.
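The comparison described above can be pictured as merging small causal diagrams. The Python sketch below is only an illustration under stated assumptions: it treats each theory of change as a list of (cause, effect) links, which is a simplification of what is an interpretive process. The function `merge_theories`, the report names and every link other than the motivation/intention pair are hypothetical. Links that appear in the same direction are grouped together (loosely analogous to reciprocal translation), while pairs of links that run in opposite directions are flagged (loosely analogous to refutational synthesis).

```python
from collections import defaultdict


def merge_theories(theories):
    """Merge theories given as {report name: [(cause, effect), ...]}.

    Returns (shared, contested):
      shared    maps each uncontested link to the reports that state it;
      contested maps each pair of constructs linked in both directions
                to the reports supporting each direction.
    """
    sources = defaultdict(set)
    for name, links in theories.items():
        for cause, effect in links:
            sources[(cause, effect)].add(name)

    shared, contested = {}, {}
    for (cause, effect), names in sources.items():
        if (effect, cause) in sources:
            # The reverse link also exists: record both directions together.
            pair = frozenset((cause, effect))
            contested.setdefault(pair, {})[(cause, effect)] = sorted(names)
        else:
            shared[(cause, effect)] = sorted(names)
    return shared, contested


# Hypothetical reports. Only the motivation/intention disagreement mirrors
# the example in the text; the other links are invented placeholders.
theories = {
    "Report A": [("information", "motivation"), ("motivation", "intention")],
    "Report B": [("information", "motivation"), ("intention", "motivation")],
}
shared, contested = merge_theories(theories)
print(shared)
print(contested)
```

Here the link from information to motivation is shared by both reports, while motivation and intention are flagged as contested. This is the kind of pair that the authors chose to draw as a joint “motivation/intention” construct and explain in the text.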


Prof. Bonell wrapped up the interesting talk with some useful take-home messages:

  1. Diagrams are useful!
  2. Drawing on multiple reports of theories of change allowed for the development of overarching theories which were more nuanced than individual reports
  3. An overarching theory of change, combined with a synthesis of outcome evaluations, provides a good starting point for developing new interventions in a particular area

You can watch the recording of the session here.

Author: Salma Hassan, MSc Public Health Student, LSHTM.

Revision: Laurence Blanchard, Co-lead for the Evidence Synthesis theme, Centre for Evaluation, LSHTM

Tips and tricks for Evidence Synthesis: Session 1

Developing a protocol for an evidence synthesis: lessons from Cochrane and realist reviews

July 26, 2021

This blog post includes subheadings to enable you to refer easily to particular topics or questions of interest, which include useful links.

On June 16, the first seminar of this new series focused on developing a protocol for an evidence synthesis. It hosted panellists Dr Jane Dennis, Editor for the Cochrane Injuries Group at LSHTM who has also worked as a systematic reviewer herself at the University of Bristol, and Daniel Carter, a Research Fellow and PhD candidate in the Department of Public Health, Environments and Society at LSHTM.

Populating the protocol for a systematic review: minimum standards in context – Dr Jane Dennis

This guidance is particularly focused on clinical topics and some elements may be less applicable to those focusing on social programmes or policies. 

Jane began the seminar with a presentation on systematic review protocols drawing from her experience from within and outside of Cochrane.

First, she provided a useful reminder of the benefits of systematic reviews:

  • Their ability to synthesise unmanageable quantities of information: as of June 2021, there were 1,777,773 reports of RCTs in CENTRAL alone!
  • Overcoming the limitations of single studies which rarely provide definitive answers.
  • Two key words stood out: explicit and reproducible (perhaps it is worth making a note of these when writing our MSc projects).

Moving on to protocols, if we were to retain one message it is that a “meta-analysis is optional in a systematic review, a protocol…. is not”.  Dr Dennis used a wonderful analogy to illustrate this concept further: a protocol is “not cement… just a recipe”. A protocol should not just be seen as binding but rather as a means of increasing the reader’s trust and ensuring you are guided by your research question, not the data obtained.

Why publish your protocol?

  • To show journals and readers that you have minimised bias in your review process, by ensuring transparency and reliability and by allowing a comparison with your original plan.
  • To avoid duplication of work by assessing what research is ongoing in your field.

When to register your review or publish your protocol?

  • Before data extraction has started! After this, it may be too late as you may have been biased by seeing your search results.

What to include in your protocol?

RevMan software (on the Cochrane website) contains useful protocol and review templates.

  1. Background
    • Description of the condition: include prevalence and incidence estimates if possible.
    • Description of the intervention: in the context of other interventions.
    • How the intervention might work: consider (biomechanical) pathways if possible; hypothetical pathways are an alternative.
    • Why it is important to do this review: place your research into context with other work:
      • What unanswered questions may your review help answer?
      • Is there an older review that needs updating?
      • These questions should be asked again once your review is completed to place your findings in context, highlighting agreements and disagreements with other reviews.
  2. Criteria for considering studies for your review: these questions drive your review design and are structured by the PICO(S) framework (Population, Intervention, Comparison, Outcome and Studies):
    • Types of participants
    • Types of interventions
    • Types of comparators
    • Types of outcome measures: for clinical topics and when following GRADE and Cochrane ways of working, restrict the ‘summary of findings table’ to seven outcomes. One must be ‘harm’ or unanticipated effects.
    • Types of studies: considering the increase in trial fraud, reviewers now often only include studies pre-registered in a clinical trials register, providing proof of protocol pre-registration and ethics approval.
  3. Search methods: “your review is only as good as the evidence you looked for”.
    • For clinical topics, always search clinical trial registries, including the WHO International Clinical Trial Registry, for evidence of trial pre-registration.
    • Other databases of interest for clinical topics may include:
      • The Cochrane Central Register of Controlled Trials (CENTRAL)
      • Ovid MEDLINE
      • Embase
      • Web of Science
      • Subject specific sources
  4. Study selection plans: 2-3 reviewers should be involved to improve transparency and reduce mistakes (this is not permitted for MSc projects).
  5. Data extraction plans

The following sections should also be completed in a protocol, but are specific to the type of review:

  1. Assessment of biases – see the bibliography at the end of this summary
  2. Evidence synthesis plans
  3. Contribution of authors
  4. Potential conflicts of interest

PRISMA-P is a reporting guideline but can also be a useful guide while developing a protocol.

Where to register your protocol?

How should conflicting findings be managed?

  • Consider a meta-regression. You can rarely attribute differences to one thing.
  • You should not have an opinion about what the findings will be (to minimise bias).

An introduction to realist reviews – Daniel Carter

Daniel provided an introduction to realist reviews, a concept that many of us have heard about but may not have yet fully understood.

Critical realism considers that the way we produce knowledge is mediated by our own experiences (subjectivist epistemology – what we know) and aims to consider the ‘real’ (realist ontology – what exists).

Some may still be a little confused at this point. However, Daniel proceeded to clearly explain that realist reviews aim not only to describe what works, but also to explain why and how an intervention works: “realist reviews should be thought of not as a methodology but rather a theoretical approach to a question”.

How are realist reviews different from classic literature or systematic reviews of effectiveness?

Examples of key realist questions:

  • How and why does it work?
  • For whom does it work?
  • In what context does it work?

Realist reviews do not just aim to link pieces of information to understand effectiveness but have an explanatory aim to generate wisdom.

  • Standard literature reviews do not necessarily aim to understand causal mechanisms.
  • Systematic reviews can be implicitly realist and it may simply be that we do not have the realist vocabulary to explain our realist findings!

What are key concepts of critical realism?

Critical realism is about context. Daniel explained that the observable world should be studied but most importantly, ontological depth, or the observable world’s underlying mechanisms, should also be considered. This is more difficult as these are present “beneath the surface”, as illustrated by the iceberg below.

Illustration explaining key concept of ontological depth
Source: Jagosh, J, 2019

Realist reviews aim to determine CMOCs underlying issues of interest. But what are CMOCs? They represent a combination of:

  • Context (elements in the background allowing the exposure to impact the outcome)
  • Mechanisms (the way in which the outcomes emerge; resources offered through a program and the way people respond to those resources)
  • Outcomes (the observable part of the iceberg, the intended and unintended effects)
  • Configuration

An example of switching on a lightbulb helped illustrate this: the lit lightbulb is the outcome; the context is the power grid, since electricity can only turn the lightbulb on if the grid supplying it is working; the mechanism is the electrical circuit.
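For reviewers who keep their extraction notes in code, a CMOC can be recorded as a small structured record. The sketch below is purely illustrative: the `CMOC` class and its `describe` method are hypothetical names, not part of any realist review tool, and the wording of the lightbulb example is adapted from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CMOC:
    """One context-mechanism-outcome configuration (illustrative only)."""
    context: str
    mechanism: str
    outcome: str

    def describe(self) -> str:
        # Spell out the configuration as a single explanatory sentence.
        return (
            f"In the context of {self.context}, "
            f"{self.mechanism} produces {self.outcome}."
        )


lightbulb = CMOC(
    context="a working power grid",
    mechanism="the electrical circuit",
    outcome="a lit lightbulb",
)
print(lightbulb.describe())
# In the context of a working power grid, the electrical circuit produces a lit lightbulb.
```

Keeping the three elements together in one record reflects the "configuration" part of a CMOC: the explanation lies in how context, mechanism and outcome combine, not in any one of them alone.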

What are the steps of a realist review?

Daniel also highlighted that there is no one single way of conducting a realist review, a characteristic that enables them to approach complex problems.

Most steps are similar to a systematic review, as is the need to publish a protocol. The main difference is the philosophical realist lens the reviewer uses, also affecting methodology.

This is illustrated by the initial step of “programme theory” development which helps determine how you expect a particular intervention or exposure or policy to work on a given outcome (CMOCs) and is reviewed iteratively. This is comparable to a logic model or theory of change. Often, several programme theories are hypothesised that pattern together to produce particular outcomes.

Flowchart showing realist reviews


How should conflicting findings be managed?

  • Finding contradictory evidence does not invalidate your review. On the contrary, it can be used to test the initial programme theory, which is then refined through an iterative process.
  • Compare the context of the contradictory findings – this might highlight the role of context and how the intervention should be delivered.

Useful resources:

Realist Reviews

The recording of the session is available here. I definitely recommend taking some time to watch it!

This blog entry was written by Emma Cahuzac - MSc Public Health Student.

The Centre for Evaluation’s Student Liaison Officers would like to thank all panellists for their time, all students for their interest, and Laurence Blanchard and Delia Boccia for organising this seminar series.

Tips and Tricks for Evidence Synthesis: a seminar series on the HOW in evidence synthesis

July 22, 2021 

In this blog, Laurence Blanchard, co-lead for the evidence synthesis theme within the Centre for Evaluation, introduces us to a new seminar series designed to offer useful tips and tricks for anyone synthesising evidence. 

Have you ever wondered HOW some reviewers concretely develop and apply their methods in complex evidence syntheses, and what their rationale is?

This summer, the Centre for Evaluation is organising a short seminar series titled ‘Tips and tricks for evidence synthesis’. Nine experts present their experience of one or more complex evidence syntheses, and explain details that are not usually provided in journal articles. All key classic evidence synthesis ‘steps’ will be covered, from searching the literature to synthesising the results. The series is open to all!

The series was conceived by the three co-leads of the Evidence Synthesis theme: Dr Delia Boccia, Dr Fiona Majorin, and myself, Laurence Blanchard. The idea started with Fiona, who wanted to organise an event for students given the popularity of literature reviews for their MSc dissertation/summer projects due to the COVID-19 pandemic. I was interested in learning more about the ‘HOW’ from speakers, and Delia suggested showcasing different types of complex evidence syntheses. We hope that this series highlights the benefits, challenges, and fun in doing evidence syntheses to both regular and new reviewers.

The first seminar focused on protocols. Daniel J Carter, Research Fellow & PhD Candidate at LSHTM, and Dr Jane Dennis, Editor of the Cochrane Injuries Group (based at LSHTM) presented their tips for developing protocols for realist and Cochrane reviews, respectively. You can watch the recording here. Additionally, Emma Cahuzac, MSc Public Health student, has summarised the session in a separate blog post on this page.

The second seminar was about qualitative evidence syntheses. Dr Salla Atkins from Tampere University, Finland, presented her strategy for selecting a meaningful sample of studies from a large volume of qualitative research. Professor Chris Bonell, LSHTM, presented an evidence synthesis of theories of change that involved drawing diagrams and using techniques from meta-ethnography. You can watch the recording here.

The third seminar will be held on July 28th and will present different evidence synthesis approaches. Dr Silvia Maritano from the University of Turin, Italy, will present tricks for dealing with heterogeneity. Yanaina Chavez-Ugalde, PhD candidate at the University of Bristol, will share her tips for conducting a critical interpretative synthesis. I will present an ongoing evidence map and overview of reviews. For more info, please see the events page.

The last seminar will be on August 11th with Dr Mukdarut Bangpan and Dr Dylan Kneale, both from the EPPI-Centre, UCL. To get more details closer to the time, please register for the Centre for Evaluation’s newsletter or visit the LSHTM Events page.

If you wish to learn about evidence syntheses, you may also be interested in LSHTM’s short course on systematic reviews and meta-analyses, or you can contact the Centre for Evaluation for workshops designed specifically for your team.

Laurence Blanchard

Research Fellow at LSHTM & Co-lead of the Evidence Synthesis theme, Centre for Evaluation

A Day in the Life of an Evaluator

June 15, 2021

The Centre for Evaluation recently held a lunchtime seminar, hosted by the Centre's Student Liaison Officers, to discover more about life as an evaluator. Though the main audience was current LSHTM students, the experience and advice provided by the expert panellists is relevant to anyone seeking a career in evaluation!

With the academic year slowly approaching its end, many students wonder what will follow after their Master’s degree at the School. For those intrigued and excited about the diverse field of evaluation, the Centre for Evaluation’s student liaison officers organised a career panel on May 4th, 2021. Questions ranged from “What skills do I need as an evaluator?” and “In which fields can I work to conduct evaluations?” to “Should I pursue a PhD first before working in evaluation?”

The session started off with an introduction to our panellists: Joanna Busza, Director of the Centre for Evaluation and Associate Professor of Sexual and Reproductive Health at LSHTM; Munshi Sulaiman, Regional Research Lead in Africa at BRAC International and Research Advisor at BIGD; Anne LaFond, Director for the Center for Health Information, M&E at John Snow Inc; and Ona McCarthy, Deputy Director of the Centre for Evaluation and Assistant Professor at LSHTM.

Together, the panellists covered a wide range of sectors in the field of public health evaluation, topics and methods of interest as well as different career stages. As displayed by their diverse backgrounds and current sectors, working in “evaluation” can have many different meanings: the scope of work can range from conducting evaluation projects as an academic to working for non-governmental organisations or being hired by private consultancy firms.  

Additionally, evaluators often not only change sectors sequentially but also mix different positions and contracts concurrently. While some start off with a PhD in evaluation and slowly complement their work with consultancy contracts, others step out of academia after their MSc and learn many of their evaluation skills through direct hands-on practical work. Early in the panel, it became clear that there is no one path leading to a career in evaluation and that life as an evaluator can be diverse and manifold.

Correspondingly, the set of skills needed to work in the evaluation setting covers a wide range of topic expertise as well as methodological proficiency. Against this backdrop, all panellists agreed that young evaluators need a basic evaluation mindset and framework as well as a general grasp of quantitative or qualitative data analysis. This can be obtained as part of an MSc degree, stand-alone modules or short courses. Anything beyond the basics can be learnt directly on the job and in the field under the auspices of senior evaluators.

When asked about the balance between breadth and depth of knowledge and methodological expertise in public health evaluation, the panellists came to the conclusion that a general specialisation in either one public health topic or one methodology is critical in order to clarify the added value of the young evaluator to the team. Yet, a broad interest and initial understanding of other topics and methodologies combined with the flexibility to evolve and learn remains important. 

In addition to the “hard” topical knowledge and methodological expertise, the panellists stressed the importance of “soft” communicative and cross-sector collaborative skills needed for a successful career in evaluation: as an evaluator one is working with different actors around a programme’s funding and implementation, making it essential to be able to work and communicate in diverse teams and foster strong, formidable and long-term partnerships. This holds true especially in communicating with and “evaluating” the implementer’s work. A perceptive and knowledgeable insight into the realities of an implementer is key for successful collaboration and evaluation. 

Another question of importance to the soon-to-be MSc graduates was whether it is necessary to pursue a PhD if one is planning a career in practical evaluation outside of academia. The panellists agreed that one should pursue a PhD out of genuine interest and not merely out of career considerations. With today’s phenomenon of “degree inflation”, a PhD may be helpful in advancing one’s career, yet it is by no means an absolute requirement. Alternatively, one may start one’s career after the MSc and gain practical knowledge and experience in evaluation before deciding to undertake a PhD at a later career stage. The insights gained into real-life evaluation will help shape and enrich your academic work. The panellists agreed a PhD does offer a great opportunity to focus and deep dive into one part of evaluation for a longer period of time, equipping the student with a very advanced skill set and offering personal development.

Additionally, for those enjoying academic work, e.g. teaching, academic publishing and generating knowledge on a broader scope, it can be an interesting and enriching path to combine practical evaluation with academic work. This helps in staying at the forefront of state-of-the-art evaluation while ensuring that new theoretical concepts are applied to real-world settings. 

Finally, the panellists’ careers and stories were a living example of the various starting points and paths students can take in their evaluation journey. One should not shy away from applying to open positions and using existing networks to reach out and connect. A good starting point is to familiarise oneself with an evaluator’s past work before contacting them, which demonstrates one’s effort and interest in their projects. Sending an email may seem daunting at first, yet can be the start of an exciting opportunity.

To conclude, the field of evaluation is still growing, very diverse and characterised by a huge demand for skilled junior evaluators. Identify your topic or methodology of interest, invest in obtaining the basic skills and stay open and adaptable to new challenges. If you are interested in academia, a PhD may be a good starting point, but a career in evaluation does not require specific credentials or modules so much as an evaluation mindset and openness to learn and grow.

This blog entry was written by Amir Mohsenpour

The Centre for Evaluation’s student-liaison officers would like to thank all panellists for their time and all students for their interest.  

LSHTM and the Centre for Evaluation offer a number of courses, both in-person and online, as well as other events on evaluation. 

“Building the ship while it’s sailing”: the challenge of evaluating programmes that change over time

January 7, 2020

The Centre for Evaluation recently held a lunchtime seminar, in which we heard from five speakers who are undertaking evaluations of interventions that have changed. In this blog, we highlight the key challenges that change poses to traditional evaluation approaches.

There is increasing recognition that a flexible approach to intervention design and implementation is needed to address complex public health issues. Both DfID and USAID are investing in adaptive management in which an intervention and the implementation strategy are anticipated to evolve over time. There is also increasing use of human-centered design (HCD) or design thinking, a flexible and iterative approach, during the development and implementation of programmes. Programmes that change over time pose a challenge to traditional evaluation approaches, which are based on the assumption of a stable, well-defined intervention. This seminar brought together five speakers, each undertaking evaluations of interventions with both intended and unintended changes. The five interventions and the associated changes were:

  • Adolescent 360 aims to increase uptake of modern contraception among girls aged 15-19 in three countries. The programme is using principles of design thinking to develop interventions that are adapted to the local context – as such different approaches are anticipated in each country. The evaluation is being designed and baseline data collection conducted in parallel to intervention design.
  • The Tanzanian national sanitation campaign is an ongoing multi-component intervention that aims to encourage individuals to upgrade their toilet. The intervention is continually evolving and includes mass media campaigns and television shows. The evaluation, led by the EHG group at LSHTM, has just started.
  • The Expanded Quality Management Using Information Power (EQUIP) initiative involved groups of quality improvement teams testing new implementation strategies to increase the coverage of selected essential interventions for maternal and newborn care. At the start of implementation, a defined number of essential interventions were selected. Ultimately, not all of the essential interventions were included. The change happened after the evaluation had started, as a response to local context.
  • The Safe Care Saving Lives (SCSL) initiative supported the implementation of 20 evidence-based maternal and newborn care practices. The intervention was intended to be rolled out in three phases, with the second and third phases planned to be randomised to allow for a strong evaluation design. However, the implementation strategy was adapted in several steps, and operational needs made the original randomisation unfeasible. Further, the intervention received insufficient support from policy makers and health facilities. As a result, the second phase was never fully implemented and the third phase was cancelled. The evaluation now seeks instead to understand why implementation was so difficult in the respective context.
  • The Gombe partnership for maternal and newborn health (MNH) aims to strengthen health services through a package of interventions delivered by four NGOs. The intervention has started to be scaled up, including to comparison facilities, while the IDEAS-led evaluation is ongoing. In addition, components of the intervention have changed in response to emerging learning and changes in the national strategy for MNH.

Key challenges identified in the seminar to undertaking an impact evaluation in the context of change are: (1) defining and re-defining the intervention; (2) identifying the target population and an appropriate counterfactual; (3) determining outcome measures; and (4) purpose and timing of the evaluation.

Challenge 1: defining and re-defining the intervention

Describing the intervention was often challenging, especially where evaluation decisions are being made before an intervention and its implementation strategy have been finalised. In some cases, the evaluation team were the first to articulate a theory of change, mapping the different components of the intervention and its intended outcomes and impacts.

Both planned and unplanned changes to the intervention over the course of the evaluation were perceived not to be clearly documented and the task of capturing changes fell to the evaluation team, often piecing together retrospectively what had happened. In all cases a close working relationship with the implementer and the use of mixed methods were seen as essential to be able to articulate a theory of change and capture changes in implementation. The Adolescent 360 study includes a process evaluation, and the evaluators have found direct observation of team meetings and programming activities to be the most important method to capture changes as the design process is iterative and the process has not been well documented.

The question was raised during the discussion whether a detailed understanding of the intervention was needed. The response: ‘it depends’! The need to understand the intervention was considered to depend on: whether the desired change is the same in each setting and population sub-group; whether outcomes were anticipated to change over time; and the level of detail needed for the analysis.

Challenge 2: identifying the target population and an appropriate counterfactual

Changes to the implementation strategy had a big impact on the ability to select comparison sites and ultimately the study design.

Changes to the implementation strategy of SCSL meant implementers did not adhere to the implementation sites that had been selected through randomisation. The evaluation considered a ‘dose-response’ analysis (measuring the strength of the intervention delivered in order to explore the association between implementation strength and outcomes). However, it proved challenging to define the ‘dose’ because the intervention approach was not fully defined at the outset and evolved during implementation. Adolescent 360 plans to strengthen its pre-post analysis with a dose-response analysis. Those present considered that this method warrants further consideration.
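To make the idea of a dose-response analysis concrete, here is a minimal sketch in Python. It is not taken from any of the evaluations discussed: the site-level data are made up, and the names `fit_line`, `components_delivered` and `coverage_percent` are hypothetical. It assumes that ‘dose’ can be summarised as a single number per site, which, as noted above, is exactly what proved difficult in practice.

```python
from statistics import mean


def fit_line(dose, outcome):
    """Least-squares fit of outcome = intercept + slope * dose."""
    if len(dose) != len(outcome) or len(dose) < 2:
        raise ValueError("need two equal-length lists with at least two sites")
    x_bar, y_bar = mean(dose), mean(outcome)
    sxx = sum((x - x_bar) ** 2 for x in dose)
    if sxx == 0:
        raise ValueError("dose must vary across sites")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dose, outcome))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data (an assumption for illustration only): one value per site.
components_delivered = [0, 1, 2, 3, 4]   # 'dose': intervention components delivered
coverage_percent = [10, 12, 15, 15, 18]  # outcome: coverage of a care practice (%)

intercept, slope = fit_line(components_delivered, coverage_percent)
print(f"Estimated change in coverage per extra component: {slope:.2f} percentage points")
```

A positive slope would suggest that sites receiving more of the intervention had better outcomes, but it is only an association: a real analysis would also need to deal with confounding, clustering and the measurement of dose itself.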

For a number of the evaluations it was not clear upfront where the intervention would be implemented. For example, the evaluators of Adolescent 360 considered approaches such as stepped wedge and regression discontinuity designs, but insufficient details on the implementation sites and the target population at the time the evaluation was being designed meant a pre-post cross-sectional analysis was the most appropriate approach.

In the case of the Tanzanian sanitation programme evaluation, the intervention is highly diffuse, different populations are anticipated to be exposed to different components and new components are continually being added. The evaluation team is hoping to follow individuals over time who self-report whether, and which, components of the intervention they have been exposed to, and correlate that with modifications to their sanitation facilities at home.

In the majority of the settings a further challenge identified was that countries have weak routine monitoring systems, which makes selection of retrospective controls challenging. Could a country-wide evaluation platform (as proposed by Victora et al. in 2011, Heidkamp 2017, Keita et al. 2019 and others), in which different existing databases are integrated in a continuous manner, be a potential solution?

Challenge 3: selecting the right outcome measures

As evaluators we seek to understand whether an intervention had the intended outcomes. It is standard practice to publish an evaluation protocol, which specifies the key outcomes and the analysis plans at a study’s outset, ideally before the intervention has been implemented.

Where the evaluation is being designed in parallel to the intervention it can be challenging to identify the relevant outcomes. In the case of the Adolescent 360 project, the evaluation team had to grapple with designing an impact evaluation and undertaking baseline data collection before the intervention was designed. As a result, the baseline survey had to include a more comprehensive set of questions in an attempt to capture multiple potential intervention outcomes.

The EQUIP study worked with the implementation partners as the intervention was being finalised to identify four key outcomes. The study found an impact for only one of the four outcomes. This was, in part, explained by the fact that the interventions that related to two of the outcomes were never implemented. This experience raised methodological challenges: should the analysis be based on what was intended at the study outset, or should it be changed to reflect what the implementation team actually ended up doing? Ultimately the evaluation stuck to an intention-to-treat analysis and drew cautious conclusions.

It might be desirable to allow outcome measures to be added or dropped over time. However, a constant set of indicators is vital for understanding whether a programme achieved its intended goals. In his blog, Julien Barr suggests identifying bedrock outcomes, which don’t change over the course of the programme as well as including a basket of output indicators that can change.

Challenge 4: the right evaluation at the right time?

Are we trying to support decision makers to make day-to-day decisions or are we trying to examine the impact and cost-effectiveness of a programme? These were seen as two distinct aims requiring different approaches to evaluation. Evaluations are frequently commissioned for the latter but in practice are frequently needed for the former.

There was discussion during the course of the event that evaluations might need to do more to support implementation. As a result of implementation challenges, the evaluation of SCSL placed greater emphasis on understanding why the intervention was not taken up coherently and why implementation was diluted. A close working relationship with the implementers and funders was seen as pivotal to this shift and to ensuring the utility of the evaluation.

This led to questions around whether we are doing impact evaluations too early. Should we be waiting for concepts to have been designed and piloted before measuring impact? Evaluation could first help implementers dynamically ‘crawl the design space’ by simultaneously testing alternative interventions and adapting the programme sequentially before scaling up and evaluating. Taking a stepped approach to evaluation would slow the process down and increase the costs, and it was questioned whether there would be support for such an approach from implementers and funders. Alternative options, such as ‘hybrid designs’, seek to blend effectiveness and implementation research to enhance the usefulness and policy relevance of clinical research.

Key messages

  • Evaluators should work with implementing partners to document changes, define the intervention at different stages and refine the theory of change accordingly. Mixed method approaches are most suitable for capturing change.
  • Clarify the role of the evaluation for the implementers and the donors. Have honest discussions about the most appropriate timing of the evaluation and reassess the evaluation regularly as change occurs.
  • Clearer guidelines are needed on the most appropriate analysis in the context of change.

Weekly links | Week of 1st July

July 5, 2019

Phew – it’s actually hot outside. If like me you’re not sure how to cope when summer finally arrives, perhaps these links will give you something to read in the shade.

  • The hugely ambitious and hotly anticipated popART trial has been in the news (NPR), with an interesting take on the who, the whys, and the what nexts. The trial found that universal test and treat for HIV can reduce population-level incidence of HIV by meaningful levels. 
  • Alicia McCoy offers some candid thoughts on life as an evaluator in an NGO. She touches on something that seems to come up in discussions in the Centre: what is evaluation for? Evaluations, she suggests, are often seen as being about accountability, but should be used for more than that, including learning how to design better interventions next time. That’s more difficult, however.
  • We’re a broad church, in the Centre for Evaluation, but as a rule most of us don’t explicitly use methods from economics and econometrics. Although sometimes suffering from language barriers, there’s a lot to like and learn from the econometrics field. For a sense of what’s out there, David McKenzie has put together a list of technical topics.
  • Twitter is a total waste of time that makes you feel bad about yourself, right? Wrong! A small corner of Twitter is bucking the trend and engaging in nice, supportive, and informative discussions. Yes, you guessed it, it’s the epidemiologists. Search #epitwitter to pull up posts on causal reasoning, coping with PhD stress, paper writing, and dogs. There’s even a #epibookclub where denizens of Epi Twitter are reading Nancy Krieger’s Epidemiology and the People’s Health together over the next couple of months (only just started). She’s even agreed to answer questions about the book at the end of the summer.

Weekly links | Week of the 17th June

June 21, 2019

Like this start-stop-start(?) summer we’re having in London, we’ve not always been able to throw together the Weekly Links blog post. We’ll try as hard as we can to put one out each Friday, but only when we’ve found enough interesting material. Hopefully you’ll agree the five links below were worth the wait:

  • We thought that this blog, on the challenges of evaluating integrated care in the UK, was interesting. Eilís Keeble refers to multiple issues: data not always capturing the same populations, definitions of indicators changing over time, challenges identifying comparison places in settings where multiple things are going on, etc. Probably sounds familiar to some of you.
  • We’ve recently discovered a website called ‘Changeroo’ for developing theories of change. From the animated video on the homepage, and video of the software being used, it looks promising. Please be in touch if you have used it or are planning to. They also have a ‘ToC-Academy’ with free tools and resources to develop and refine theories of change.
  • From @fp2p on Twitter: “When will we get a report on your findings?”, reflections on researcher accountability from the DRC by Christian Chiza Kashurha. Lots to think about, including this scene: One day, I was passing back through [a research] community when suddenly I came across two of our former respondents. After some greetings, their words grew blunt: “Manake mulikuyaka tu tupondeya muda na kukamata maoni yetu nanjo muka poteya! Ju mpaka sai hatuya onaka mutu ana kuya tuambiya bili ishiaka wapi.” (“So basically, you just came here to waste our time collecting our opinions – and then that’s that: you disappeared! Because since then, we’ve never had anyone come back to tell us the outcome or results of what you were doing here.”) One of the two was very blunt indeed: “Si mulishaka kula zenu, basi muna weza tu kumbuka siye benye tulitumaka muna pata hizo makuta.” (“Now that you’ve gotten your food [i.e. been paid for your research], couldn’t you at least remember those of us who made that possible for you?”)
  • Speaking of communicating results, Alexander Coppock has produced a paper on visualisation for randomised controlled trials, using R. He’s even published the code for the paper, here.
  • By discussing the example of the Teen Pregnancy Prevention programme in the USA, and the low impact of the interventions, the Straight Talk on Evidence blog touches on a more general issue. They say the low impact was because the method for choosing interventions was not good. They contrast this with a different, more rigorous, approach. How interventions are chosen, and for what places, seems somewhat under-researched, despite its potentially huge influence over the effects observed.

Weekly links | Week of the 20th May

May 24, 2019

Whoops! Missed a week… sorry. Here are five stories of interest from the evaluation world.

  1. The IFS launched the IFS Deaton Review, with Nobel Laureate Angus Deaton (gettit?). The idea is to bring together people from many disciplines to ‘build a comprehensive understanding of inequalities in the twenty-first century’, as well as ‘to provide solutions’.
  2. However, not everyone was too impressed. 40 researchers criticised the make up of the panel, and Faiza Shaheen gave a more personal take on non-white exclusion (and the terms for inclusion) on panels such as these.
  3. Annette Brown has been looking at the literature on gender bias in grant proposal reviewing — summarising her thoughts here.
  4. With possible implications for other evaluation research, a new paper from IJE concludes that when estimating non-specific effects of vaccines there can be bias from right or left censoring, which needs to be accounted for.
  5. A review in JAMA Oncology looked at 143 anticancer drug approvals by the FDA and found that 17% of those approved had ‘suboptimal’ control arms. The implication is that the effectiveness of the drugs was being overestimated. This review emphasises the equal importance of understanding the intervention and the control arm; evaluations too often neglect to describe the control arms in much detail.

As always, please send ideas to

Have a good long weekend!

Implementation Science

May 15, 2019

In the last few years, a new field of ‘implementation science’ has emerged, which focuses on bridging the gap between efficacy and impact. Prof. James Hargreaves, Former Director of the Centre for Evaluation and Professor of Epidemiology and Evaluation at LSHTM, has been invited on different occasions to give a presentation on implementation science in HIV research.

Presentation title | Presentation description
Demystifying Implementation Science | This introductory video, produced for the ViiV AIDS 2018 Pre-Conference Workshop in Amsterdam, aims to demystify the term 'Implementation Science' and introduces examples of work that address key questions in this area.
What is Implementation Science? | This talk, presented at the inaugural Implementation Science Network meeting, which took place at the 9th International AIDS Society Conference in Paris, describes how implementation science is defined in the HIV literature, how implementation science questions should be framed, and methods that could be used to yield rigorous results.
Implementation Science Trials: Do the rules of RCTs apply? | This presentation, from the Conference on Retroviruses and Opportunistic Infections (CROI) in Seattle, outlines the aims of implementation science and the rules of Randomised Controlled Trials (RCTs), and identifies four adaptations to the conduct of RCTs that are relevant in the Implementation Science setting.

Weekly links | Week of the 6th May 

May 10, 2019

It’s been a short week here in the UK, but with four seasons in four days it feels like a while since Monday. Here are five evaluation-related links to start the weekend.

  1. Canada’s International Development Research Centre (IDRC) have announced a new tool for assessing evidence that gives more weight to research that accounts for context and articulates dimensions of quality. The idea is that this will move away from traditional metrics that favour literature in American and Western European journals and raise the profile of Southern-only research. They published an article on their approach in Nature.
  2. Ultra-poor graduation programmes have been causing a lot of discussion for a while in the development sector. There’s been a lot of good evidence produced through rigorous evaluation. To add some depth to the findings, qualitative research is being published, and is described here on the World Bank’s blog. 
  3. Difference in difference models are special cases of lagged regression — or are they? See blog and comments for discussion.
  4. A (very sad) cautionary tale: warning people not to drink arsenic-contaminated water in Bangladesh may have increased child mortality by 45%; other options were contaminated with human waste. A reminder that interventions have potential to do harm, which should be captured in evaluation.
  5. Using evidence from 16 studies, researchers have found that people will often object to randomisation to see which of two policies is better, even when there is no reason to pick one policy over another. As they say, ‘This experimentation aversion may be an important barrier to evidence-based practice.’

Weekly links | Week of the 29th April

May 3, 2019

Hi again! Last ‘week’ was just three days long because of LSHTM’s generous Easter break, and there wasn’t quite enough time to get through emails, let alone collect five links to share on the blog. This week, however, there’s a lot to have a look at:

  1. First, and most important, is the Centre for Evaluation termly newsletter — check it out here, and have a think about sharing your own work with our members in the next newsletter in a few months.
  2. Taking Twitter to the next level, @statsepi Tweeted a thread that uses simulations to show that when adjusting for covariates in randomised trials it’s not the baseline imbalance that’s important but the degree to which covariates predict the outcome at the end.
  3. Although perhaps a minority in the Centre for Evaluation, epidemiologists have a big influence on how we conduct research at the School. Which is why you might find a collection of think-pieces on the Future of Epidemiology in the American Journal of Epidemiology interesting. There’s one about teaching, which thinks about how we talk about causation in epi teaching, which has resonance with the evaluation field.
  4. Researchers Julian Kolev, Yuly Fuentes-Medel, and Fiona Murray have looked at gender disparities in appraisals of ‘innovative research grant proposals submitted to the Gates Foundation from 2008-2017’. They found that ‘despite blinded review, female applicants receive significantly lower scores, which cannot be explained by reviewer characteristics, proposal topics, or ex-ante measures of applicant quality’. They attribute this to differences in communication styles (and presumably preferences for particular styles on the side of the reviewers).
  5. Finally: would we be more productive in monasteries? While academia used to be associated with religious orders, now our daily lives are far from the quiet introspection and isolation that used to be practiced. Cal Newport wonders if the lengths taken to concentrate fully on spiritual insights were actually necessary to overcome our natural limitations, and that constant email and open-plan offices might be keeping us from work satisfaction.
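The claim in the second link, that covariate adjustment in a randomised trial pays off when the covariate predicts the outcome, is easy to explore with a small simulation. The sketch below is not the code from that thread. It is a minimal illustration under assumed conditions (a linear outcome model with normally distributed noise, equal-sized arms, one baseline covariate), and all function names and parameter values are hypothetical.

```python
import random
from statistics import mean, stdev


def slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx


def residuals(y, x):
    """Residuals from a least-squares regression of y on x (with an intercept)."""
    b = slope(x, y)
    x_bar, y_bar = mean(x), mean(y)
    return [yi - y_bar - b * (xi - x_bar) for xi, yi in zip(x, y)]


def one_trial(rng, n, covariate_effect, treatment_effect=1.0):
    """Simulate one two-arm trial; return (unadjusted, adjusted) effect estimates."""
    arm = [i % 2 for i in range(n)]
    rng.shuffle(arm)  # randomise allocation, n/2 participants per arm
    x = [rng.gauss(0, 1) for _ in range(n)]  # baseline covariate
    y = [treatment_effect * a + covariate_effect * xi + rng.gauss(0, 1)
         for a, xi in zip(arm, x)]
    unadjusted = slope(arm, y)  # equals the difference in arm means
    # Adjusted estimate: coefficient on arm in a regression of y on arm and x,
    # obtained by first removing the covariate from both y and arm.
    adjusted = slope(residuals(arm, x), residuals(y, x))
    return unadjusted, adjusted


def spread_of_estimates(covariate_effect, n=100, n_trials=500, seed=1):
    """Standard deviation of each estimator across many simulated trials."""
    rng = random.Random(seed)
    unadj, adj = zip(*(one_trial(rng, n, covariate_effect) for _ in range(n_trials)))
    return stdev(unadj), stdev(adj)


for effect in (0.0, 2.0):
    sd_unadjusted, sd_adjusted = spread_of_estimates(effect)
    print(f"covariate effect {effect}: SD unadjusted {sd_unadjusted:.3f}, "
          f"SD adjusted {sd_adjusted:.3f}")
```

In this set-up, both estimators are centred on the true effect because allocation is random. What changes is precision: when the covariate has no relationship with the outcome, adjusting gains essentially nothing, and when it is strongly prognostic the adjusted estimates vary much less from trial to trial.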

Weekly Links | Week of the 8th April

April 12, 2019

This week we have a couple of papers from Epidemiology, an intro/thoughts on coding in qualitative research, an invitation to join the School’s R-users group, and more. Please send all suggestions to before 9am on Friday.

  1. Ever the engaging speaker, David Spiegelhalter has a recorded talk at LSE called Learning from Data: the art of statistics. Definitely worth a listen when on a coffee break.
  2. Eleanor Murray and team at Harvard are proposing guidelines for more informative causal inference in pragmatic trials. They are looking for feedback on draft guidelines so be sure to click on the link and send your thoughts.
  3. Sonja Swanson has an engagingly-written piece on the threats of bias when using instrumental variables in Epidemiology, and Sam Harper takes a Bayesian approach to evaluating seat-belt policy and its potential to reduce road deaths (he describes the typical frequentist approach to this problem as ‘empirically absurd, given what is already known from prior studies’; intrigued?)
  4. At the BetterEvaluation blog, Helen Marshall shares some practical insights into coding while doing qualitative research.
  5. The statistical software/coding language R has many benefits for evaluation, such as beautiful charts, interactive maps, reproducible reports, and oh — it’s free. The School now has a flourishing ‘R Users Group’ with over 200 subscribers to the mailing list. They meet once a month and one or two members share features of R. The group is for advanced users and total novices alike — if you’re at all interested, follow the link above and sign up!

Weekly links | Week of the 1st April

April 5, 2019

Here are five blogs or papers that we’ve found interesting this week. Remember — please send us whatever you’re reading so we can share with the Centre members. Have a great weekend!

  1. ‘Development interventions have similarities to medical treatments: if you treat superficial symptoms rather than the underlying pathology, or if you give the wrong medicine, you will not cure the illness.’ This, and many more excellent pieces of advice and reminders, come from Marie Gaarder in a new commentary.
  2. An older paper from Penelope Hawe looked in-depth at how a control group in an Australian trial understood being ‘controlled’, and how this might have biased the results (towards no effect).
  3. Synthesis Theme Leader, Kathryn Oliver, has just published an article on the nuances of ‘co-production’ in research, asking: do the costs outweigh the benefits?
  4. More thoughts on the statistical-significance debates. Andrew Gelman wonders why he’s bothering to weigh in, when this debate has raged on for many decades. He thinks he, and his collaborators, have alternatives to offer.
  5. On the World Bank Impact blog, Markus Goldstein summarises some of the work economists have been doing to understand how ‘edutainment’ can reduce (attitudes about) intimate-partner violence.

Weekly links | Week of the 25th March

March 29, 2019

Another week, another list of evaluation links. If you have anything you think others might be interested in, please send them over to and I’ll include them in the following week:

  1. In a post at the World Bank’s Development Impact blog, Berk Özler takes a skeptical look at modern econometric attempts to create ‘synthetic’ control groups instead of randomly allocated ones.
  2. David Fetterman is interviewed for the BetterEvaluation site, arguing that evaluations should be ‘unboxed’ (skip over the clunky YouTube ‘unboxing videos’ analogy) by empowering the ‘community’. Seems there is a lot going on between the lines here, with reference to a debate going back to 1993!
  3. As Centre members have argued in the recent past, Robert Crease discusses the care needed to ensure that science and evidence have appropriate authority in the days of climate and vaccine denialism.
  4. In addition to the crises of p-values, Brexit, and climate change, the Cochrane Collaboration is having its own internal schisms. Where better to read about academics falling out than in a paper on the matter that ‘begins from the philosophical position that reality is multifaceted and multilayered’? To be continued, I’m sure.
  5. A ‘stakeholder survey’ (that includes you!) is being conducted between 22 March and 05 April (follow this link to take part) regarding the draft update to the MRC guidance on Developing and Evaluating Complex Interventions. The MRC guidance was a big topic of discussion at the retreat last year (read a summary here).

Have a good 47 hour weekend, and enjoy the weather!

Weekly links | Week of the 18th March

March 22, 2019

A few weeks have passed since the last update, apologies. Here are four interesting evaluation-related links to finish the week:

  1. Big claims from Caroline Heider, Director General Evaluation at the World Bank Group, about a ‘Copernican’ moment for evaluation methods in international development as the famous ‘DAC’ criteria are reconsidered.
  2. Many hundreds of researchers (including some of LSHTM’s leading statisticians), led by a group including Modern Epidemiology’s Sander Greenland, have signed a declaration rejecting statistical significance, although not everyone is convinced that this is how science is supposed to work.
  3. As part of recognising Feminist Issues in Evaluation, the American Evaluation Association published a series of blog posts, including ‘Data is Not Objective: Feminist Data Analysis’ by Heather Krause.
  4. A new (free) book describes the ‘Qualitative Impact Protocol’, which promises methods to attribute impact without comparison groups.

Reflections on the Biennial Retreat: Places, Spaces, and Contexts

March 18, 2019

The Centre for Evaluation (CfE) held its biennial retreat on Thursday, December 6th, focusing on the theme of “Places, Spaces and Contexts.” Held “off campus” in the Hatton Garden area, the retreat brought CfE members together for a day of discussions, networking, and musings on all and any issues raised regarding evaluation theory and practice at LSHTM.

We started the day with a set of 8 “speed talks” through which CfE members presented ongoing evaluation work and described how they accounted for or were challenged by context. For example, interventions developed through “human centred design” by definition will vary in each location where implemented, yet a standardised evaluation is expected across programme sites. The presenters described similar experiences of really needing to “drill down” into what happens at local level through participatory and process-oriented methods. Yet methods themselves need to be carefully matched to the social environment in which they are meant to capture key variables. This was illustrated by a study in which girls’ school attendance was an important outcome measure. While researchers initially thought girls might over-report attendance at school, in reality some girls hid from the researchers and marked themselves absent if they were worried they hadn’t completed study-related tasks, thus potentially leading to under-estimates of their attendance. In all the cases presented, strong formative work and mixed method process evaluations were highlighted as ways to track the realities of intervention delivery on the ground.

Woman presenting at CfE Biennial Retreat

In the next session, three LSHTM researchers gave insights into different “place” related influences on research practice. First, Chris Grundy’s talk, “Why maps matter”, grounded health evaluation in physical geography. The diseases and social phenomena we often try to measure have spatial distributions that can deepen understanding of how they work, and rapidly developing technology provides new opportunities (but also new ethical dilemmas) for what can be mapped and visualised. Unfortunately, while interest in the use of GIS, including open source data and electronic data collection methods, is increasing, funding to ensure that a GIS specialist is involved in health evaluation projects does not reflect this.

Next, Catherine Pitt gave an overview of “Economic Evaluations of Geographically Targeted Interventions,” highlighting the importance of good costing data to help prioritise use of scarce resources. Yet economic costs are highly context-specific, making it difficult to transfer findings from an economic evaluation in one place to targeted programmes elsewhere. She gave examples of how carefully designed cost modelling could be used to tackle the “transferability challenge” by showing the relative merits of different configurations of health packages.

Finally, Kathryn Oliver talked about the gap between evaluation evidence and its uptake and use by policy makers. In her talk entitled “What Makes Evidence Credible?” she highlighted the way that researchers and policymakers speak different languages and value different types of evidence and styles of persuasion. She illustrated how researchers can sometimes become ‘Rapunzel in the Ivory Tower’, feeling uncomfortable with the idea of making health information more anecdotal, emotional or engaging. Using analysis from her work on policymakers’ use of evidence in decision making, she urged researchers to become better at accepting political structures and processes, and make “professional friendships” to bridge the evidence-policy divide.

Woman presenting at CfE Biennial Retreat

Following a massive and delicious lunch, Cicely Marston and Chris Bonell ensured we didn’t get too sleepy by engaging in a lively debate about the need for structured process evaluations. Although both admitted they agreed more than they disagreed, they encouraged discussion within the group by pitting the use of pre-defined process evaluation frameworks (such as those developed by the MRC and realist evaluations) against a less structured, less positivist and more participatory approach. Drawing on examples from their own work, Chris and Cicely talked about the merits of defining and identifying constructs such as context, mechanism, and outcomes versus working with the communities most affected by interventions to shape the nature of the research questions. They also highlighted how poorly designed process evaluations risk collecting too much data that then never gets analysed properly, or superficially conducting qualitative research without the requisite skills for meaningful analysis. This session encouraged discussion across all retreat participants about the timing, design, and use of good process evaluations.

Man presenting at CfE Biennial Retreat

In our final session, we hosted guest speaker Rachel Glennerster, Chief Economist at the Department for International Development. Rachel presented her work on the “Generalizability Puzzle”: considering how data on successful interventions from one setting might usefully be applied to successful implementation elsewhere, without the need for conducting expensive and time-consuming randomized controlled trials in every possible context. Based on published work, Dr Glennerster emphasised that rigorous evaluation trials are needed to demonstrate “proof of concept” and to develop theories about human behaviour and the effectiveness of development programmes and policy, but that these theories need not be tested in every local environment. Instead, successful application of theoretically-driven interventions requires a good understanding of local institutions and social organisation, while general patterns and trends gleaned from global knowledge should be trusted as broadly generalizable. Her proposed “generalisability framework” calls for more exploratory and descriptive research to check whether the conditions for any given theory behind a successful intervention are present in a new context for its implementation, and to gather the necessary data for both refining and then evaluating local intervention design. This final session was moderated by Professor Anne Mills, and was followed by a social reception for more informal discussions and wrapping up the day.

Weekly links | week of the 11th of Feb 

February 15, 2019

Seven days have flown by; here are this week’s links:

1. In a short post, Kylie Hutchinson offers some pithy tips on developing a post-evaluation action plan, with links to resources.

2. Taken from the weekly links on the World Bank’s Development Impact blog, Rachel Glennerster reflects on one year as DFID’s chief economist. Echoing Geoffrey Rose, she notes the importance of effect at scale: ‘it is far better to achieve a 10% improvement for 1 million people than a 50% improvement for 1,000’. This idea reflects the original (epidemiological) use of ‘impact’ (as opposed to effect), which is a function of the effect of an exposure and its prevalence, a meaning that has been lost in the modern use in ‘impact evaluations’ used to estimate programme effects.
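
The arithmetic behind that comparison can be made explicit. The sketch below assumes, purely for illustration, that "improvement" can be counted as effect size multiplied by the number of people reached; the function name `population_impact` is hypothetical and does not come from the post being linked to.

```python
def population_impact(effect: float, people_reached: int) -> float:
    """Effect size (as a proportion) multiplied by the number of people reached."""
    return effect * people_reached


# 10% improvement for 1 million people versus 50% improvement for 1,000 people.
at_scale = population_impact(0.10, 1_000_000)
small_pilot = population_impact(0.50, 1_000)

print(f"At scale:    {at_scale:,.0f} person-equivalents of improvement")
print(f"Small pilot: {small_pilot:,.0f} person-equivalents of improvement")
print(f"Ratio:       {at_scale / small_pilot:,.0f}x")
```

On this simple measure the smaller effect at scale yields roughly 100,000 units of improvement against 500, which is the sense in which impact depends on both effect and prevalence.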

3. Speaking of effects, Judea Pearl and Dana Mackenzie have a relatively new book out, The Book of Why, reviewed here in the NYT. Pearl and colleagues have spent the last few decades developing a complete language of causal analysis. Writing for computer scientists and mathematicians (as well as public health researchers, sometimes), their published academic work can be quite impenetrable, so it’s exciting to see them produce an introductory book.

4. In a self-reflective blog, three evaluation experts reflect on how equity can be a leading principle in evaluation. There are helpful links to further reading at the end.

5. Finally: can statistics indict President Trump? Maybe: the authors of this blog post use simulations to show that it is unlikely (given their model) that the payments we know were paid to Stormy Daniels didn’t come from the Trump campaign (note the double negative). How is this related to evaluation? Tenuously; but it reminded me of an interesting paper in AIDS, by Marie-Claude Boily et al., that used mathematical models to investigate the plausibility that observed changes in the prevalence of HIV at antenatal clinics were due to interventions with sex workers in Karnataka, India. Even with ‘optimistic prevention parameters’ their results suggested that the changes couldn’t be entirely due to the sex-worker interventions. I’ve not seen many examples of this kind of plausibility testing with mathematical models.

Have a good weekend!

Weekly links | week of the 4th Feb 2019

February 8, 2019

We’re trying something new: five links to interesting articles, blogs, videos, and other resources about evaluation every Friday, every week.

We hope you’ll learn something; if you come across anything that you would like to share, please send us a message with the title ‘links’. Thanks! Here’s this week’s list:

  1. Authors from Cardiff University, with our own Chris Bonell, rethink evaluation with complex systems in mind. With lots of examples, they make practical recommendations, including arguing ‘that … acknowledgment of complexity does not mean that evaluations must be complex, or investigate all facets of complexity’.
  2. Along similar lines, the team at Better Evaluation have a short blog on ‘demystifying systemic thinking’. They cite Professor Thomas Schwandt as saying that our evaluations are happening in ‘post-normal’ times, which are less about straightforward problem-solving and more about embracing complexity, plurality, democracy, and context responsiveness. The blog refers to a new resource: Inclusive Systemic Evaluation for Gender Equality, Environments and Marginalized voices (ISE4GEMs), which ‘offers an alternative way of thinking and planning about evaluation practice and its application to complex (messy/wicked) problems.’
  3. In the journal Epidemiology, authors David Rehkopf and Sanjay Basu explain the synthetic control method for quantitative case-study impact evaluation. The method has been used by CfE member Aurélia Lepine and co-authors to estimate the impact of removing user charges for health care in Zambia.
  4. On the World Bank Impact blog, David Evans reviews The Goldilocks Challenge by Mary Kay Gugerty and Dean Karlan. The book discusses the balance between traditional monitoring and evaluation for NGOs and more recent trends towards impact evaluation.
  5. There’s been a fair amount of discussion of Angus Deaton and Nancy Cartwright’s Social Science and Medicine paper on the limitations of randomised controlled trials for informing policy. The journal published a large number of commentaries on the paper, all of which are interesting to read, although counterintuitively perhaps the best way of getting into the debates is to read the authors’ response to the commentaries first to get a sense of the debate.

That’s it for this week! Please do send anything that you would like to share.

Process Evaluation skills-building workshop, Zambia

January 14, 2019

The Centre for Evaluation is launching a series of skills-building workshops around the world, organised in partnership with our collaborators, and held in locations where LSHTM staff are based or regularly visit. In the first instance, the thematic focus is Conducting Process Evaluations. The Centre has developed a short curriculum comprising 3 PowerPoint presentations, 3 case studies and related group work exercises. These can be adapted to fit local contexts and interests, and it is expected that local researchers will also present their work and facilitate discussion. We have two sample agendas to help structure 1- or 2-day workshops. Process evaluation theme lead Bernadette Hensen recently delivered a workshop in collaboration with Zambart in Lusaka, Zambia.

The first Centre for Evaluation supported skills-building workshop on Process Evaluation was held on 6 December 2018 in Lusaka, Zambia. The workshop was run in collaboration with Zambart, a research organisation established in 2004 and an LSHTM collaborating partner. To keep costs to a minimum, the workshop capitalised on my being in Lusaka to work with colleagues on a formative research study.

Process Evaluation Zambia
Participants discussing how PE can be integrated into their work.

The workshop saw 12 participants, from Zambart, the University of Zambia, other stakeholders and researchers, come together to build new skills and share experiences of process evaluation. Musonda Simwinga and I ran the workshop, with support from Ginny Bond. In the morning, participants discussed what process evaluation is, why process evaluations are useful and when they can be used. We also discussed evaluation frameworks, including logic models, as useful models to map intervention pathways and guide process evaluation design. Zambart has extensive experience running large cluster-randomised trials, including ZAMSTAR (Zambia and South Africa TB Reduction Study) and the HPTN (071) PopART (Population Effects of Antiretroviral Therapy to Reduce HIV Transmission) trial. In these trials, rich quantitative and qualitative data were collected alongside data to evaluate the impact of the interventions on primary outcomes. These data were collected to understand, for example, the effect stigma had on participants’ engagement with the intervention, intervention acceptability, and the important relationship lay counsellors have established with participants, including the influence this has on introducing innovations such as HIV self-testing. We discussed how much of the research conducted by Zambart within trials is implicitly about process, and that making this more explicit would provide a guiding framework for exploring assumptions about how interventions work, as well as valuable information for scaling up interventions that are successful and modifying interventions where there is no impact. It would also ensure a more genuine trans-disciplinary engagement. Two participants were experienced in realist evaluation: Chama Mulubwa, manager for a series of HIV self-testing case studies called STAR and research degree student at Umeå University, and Dr. Joseph Zulu, Assistant Dean at the School of Public Health, University of Zambia.
The participants discussed the overlap between realist and process evaluation, and where these two fields were distinct.

After discussing process evaluation more generally, we discussed indicators and tools. In this discussion, process evaluation was seen as heavily quantitative, particularly when measuring intervention implementation and reach. We discussed where along a logic model, and how, qualitative concepts and data collection tools can complement quantitative data collection. In this session, Mwelwa Phiri, a prospective LSHTM research degree student, presented her ideas for a process evaluation to be embedded within a trial of sexual and reproductive health services for adolescents and young people in Lusaka. The workshop ended with a discussion about when and how to analyse data arising from a process evaluation, and how process evaluation could be embedded more explicitly in our work.

Although I have worked with Zambart since 2009, this was the first workshop I’ve organised with them. It was a great opportunity to discuss thinking related to process evaluation and debate how it aligned with work already ongoing at Zambart, and with related concepts such as Monitoring & Evaluation and Realist Evaluation, and whether qualitative data could be defined as “routine”. The workshop was stimulating and engaging, albeit a bit rushed as it was held over a single day. We also wished that more colleagues from the School of Public Health at the University of Zambia could have attended. We are hopeful that the workshop may lead to a similar seminar or workshop being held at the University of Zambia, and will expand to include a workshop on quantifying impact using quasi-experimental designs. We also hope to carry out a process evaluation on a joint proposal and grant.

Similar workshops are being planned for Ethiopia, South Africa, Zimbabwe and beyond. These are designed to be low-cost, taking advantage of existing LSHTM presence and travel, and sharing logistical costs with local partners. If you are interested in organising a Process Evaluation workshop (or designing a different skills-building event) please get in touch!

HSR 2018

October 3, 2018

Members of the Centre attended and presented at the Health Systems Conference in Liverpool in October. The focus was on advancing health systems for all in the sustainable development goals (SDGs) era. The Centre produced a conference map to highlight key sessions on quantifying impact, understanding implementation and evidence synthesis. The map includes sessions from different organisations and across different health topics.

Health Systems Conference in Liverpool map

We pick out some key publications from the week:

Alliance for Health Policy and Systems Research’s new methods guide for synthesizing evidence from health policy and systems research (HPSR) to support health policy-making and health systems strengthening.

Alliance for Health Policy and Systems Research’s health policy analysis reader: The politics of policy change in low- and middle-income countries.

Perez MC (2018) Comparison of registered and published intervention fidelity assessment in cluster randomised trials of public health interventions in low- and middle-income countries: systematic review.

Rowe A K (2018) Effectiveness of strategies to improve health-care provider practices in low-income and middle-income countries: a systematic review

Theobald S (2018) Implementation research: new imperatives and opportunities in global health.

The Lancet Global Health Commission on High Quality Health Systems in the SDG era.

USAID Developmental Evaluations

Complex Interventions? Insights through Process Evaluation 

March 9, 2018

By: Queena Luu

LSHTM MSc Public Health, Health Service Research, student

Public health deals with a variety of interventions that interact not only with the healthcare system but the broader social, political, and economic contexts. Thus, large scale health interventions can be inherently complex with a range of components, outcomes, and stakeholders involved.

Assessing the effectiveness of interventions directs attention to measurements of outcomes. Process evaluation expands on such findings by answering questions beyond ‘does this intervention work?’ to ‘what components of the intervention lead to the results?’ and ‘how do implementation strategies affect outcome measures?’

Dr. Stefanie Dringus from LSHTM gave a one-day training on operationalizing process evaluation. Attendees were introduced to the key domains of process evaluations: building strategies to address context, implementation, and mechanisms of impact. The training led to discussions about the degree to which interventions are adapted to the local context and how focus groups may reveal what aspects (‘active ingredients’) of multi-component interventions were most impactful for participants.

After reviewing the theory, attendees were given the chance to simulate the development of a process evaluation for a hypothetical community role model-based intervention. The intervention had the goal of reducing risk factors associated with sexual behavior among young people. In small groups, attendees developed logic models and discussed methods to evaluate the process through which the intervention was implemented.

Groups brought up various issues associated with measuring primary and secondary outcomes because the intervention had many seminar-based components. My group also touched upon the need to use mixed methods to understand the mechanisms of implementation, including using focus groups and in-depth interviews to understand how receptive young people were to coach-led discussions that addressed sensitive topics. We also discussed how the coverage of the intervention, and how data is collected, can influence the outcome measures.

After reconvening, various groups presented their logic models. Each group had approached the process evaluation design from different angles – from constructing detailed tables for each of the research domains to using highlighters to cross reference different research questions and their associated process evaluation methods that need to be explored.

At the conclusion of the workshop, there was a rich discussion on how the roll-out and data collection for intervention studies need to be placed in the local context. Complex interventions are embedded into various layers of the community, and it is important to be critical about what questionnaires are administered, how rapport affects responses, and how the process evaluation team, the outcome evaluation team, and the community are communicating with each other.

Process evaluation is one key component in understanding how and why interventions work. This is especially important given the increasingly complex interventions that are being implemented to address determinants of health.

Blog reports on a student workshop on process evaluation organized by the Centre for Evaluation. Read more about process here.

Data II Action – Beginning of the road. Putting data to work means saving lives

February 8, 2018

By: Yasmin Hussain Al-Haboubi

LSHTM MSc Global Mental Health Student

Population Services International (PSI) carries out their health interventions much like a Fortune 500 company would go about their business. Whilst this may seem counterintuitive to the compassionate nature of health, it is an approach that works well.

“Compassion without knowledge is ineffective”; Frederik Weisskopf’s 1998 quote still stands today, despite our methods of knowledge appraisal having jolted significantly over the past 10 years.

On 1st February, Christina Luissiana connected via Skype to LSHTM. Her clear, engaging talk took the room through PSI’s approach to the use of routine data to evaluate health interventions. PSI were early adopters of the latest health informatics system. The District Health Information Software (DHIS2) is an open source, multi-platform system created by the University of Oslo that enables governments and NGOs to collect and analyse health intervention data. It is provided without a licensing fee, and 47 countries currently use it as their main health management system. PSI’s ‘data to action’ approach aims to inform Ministries of Health of the appropriate policy recommendations based on real time data. Christina presented a Malaria Case Surveillance case study from the Greater Mekong Subregion, where a visually clear and user-friendly app has been developed to track malaria cases in non-public health facilities.

Once notified of a potential malaria case, the data collector responds (smartphone in hand) to the health centre. The app collects seven items of patient information: age, sex, malaria test result, whether they had received treatment, travel history, occupation (at-risk occupations were noted) and phone number. Once the data is generated and uploaded to the DHIS2 dashboard, it serves two purposes:

  1. Mapping malaria cases and helping identify ‘hot spots’. In the Greater Mekong Subregion the highest number of potential cases have been found close to the borders. From such geographic mapping, governments have the possibility to make informed decisions about the supply and distribution of anti-malarial drugs. This acts as a potentially cost-effective measure for LMIC Health Ministries. This has yet to be monitored fully but could be a good scope for future research.
  2. Generation of information in real time can be used to identify emerging outbreaks.
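
To make the data flow concrete, here is a minimal sketch of how the seven items might be held as a record and then counted to flag hot spots. This is not PSI's app or the DHIS2 data model: the class, field names, the `district` label attached to each report, the threshold and the sample values are all assumptions made for illustration, since the talk did not describe how location is recorded.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class MalariaCaseRecord:
    """The seven items described in the talk (field names are hypothetical)."""
    age: int
    sex: str
    test_positive: bool
    received_treatment: bool
    travel_history: str
    occupation: str
    phone_number: Optional[str]


def hot_spots(reports, threshold):
    """Return districts with at least `threshold` positive cases.

    `reports` is an iterable of (district, MalariaCaseRecord) pairs.
    """
    counts = Counter(district for district, record in reports if record.test_positive)
    return {district: n for district, n in counts.items() if n >= threshold}


# Made-up example data, not real surveillance records.
reports = [
    ("District A", MalariaCaseRecord(34, "M", True, True, "crossed border", "forest worker", None)),
    ("District A", MalariaCaseRecord(27, "F", True, False, "none", "farmer", "000-0000")),
    ("District B", MalariaCaseRecord(41, "M", True, True, "none", "trader", None)),
    ("District B", MalariaCaseRecord(19, "F", False, False, "none", "student", None)),
]

print(hot_spots(reports, threshold=2))
```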

This points to a key feature of DHIS2: a balance between customisation and standardisation. The interoperability of the app means it can overlay other platforms and software systems, particularly systems that link back to each Ministry of Health.

The rapidity of the data analytics means that there is scope for DHIS2 to be used in humanitarian emergencies and disease epidemics. PSI do not do so themselves; however, Christina pointed to the use of DHIS2 in Madagascar, in fighting seasonal epidemics of the plague (bubonic, septicaemic and pneumonic). According to WHO, 95% of those in contact with the plague, as identified by DHIS2, were provided with antibiotics.

Following the presentation there was a lively debate surrounding DHIS2 use ranging from the pragmatic to the ethical.

Logistically, the reality in LMICs remains that remote and rural areas will often have problems with both mobile connectivity and electricity, both of which are paramount to the collection of raw data. To address this, the University of Oslo’s web architects and configurators created a platform wherein the app can be used offline in low connectivity areas, and when reconnected will upload the information to the platform. And if all else fails, pen and paper have served people well for generations.
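
The offline behaviour described here is a store-and-forward pattern: records are held locally and sent once a connection returns. The sketch below illustrates that general idea only. It is not DHIS2's actual implementation, and the `OfflineQueue` class, the `upload` function and the `link` flag are all hypothetical names invented for this example.

```python
from collections import deque


class OfflineQueue:
    """Hold records locally and upload them, in order, when a connection is available."""

    def __init__(self, upload):
        # `upload` is any callable that sends one record and raises
        # ConnectionError when there is no connectivity.
        self._upload = upload
        self._pending = deque()

    def submit(self, record):
        """Store the record, then try to send everything that is waiting."""
        self._pending.append(record)
        return self.flush()

    def flush(self):
        """Send pending records oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._upload(self._pending[0])
            except ConnectionError:
                break
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self):
        return len(self._pending)


# Simulated connectivity and server, for demonstration only.
link = {"online": False}
received = []


def upload(record):
    if not link["online"]:
        raise ConnectionError("no connectivity")
    received.append(record)


queue = OfflineQueue(upload)
queue.submit({"case_id": 1})   # stored locally, nothing sent yet
queue.submit({"case_id": 2})
link["online"] = True          # connection restored
queue.flush()                  # both records are uploaded in order
```

A record is only removed from the local queue after its upload succeeds, so a dropped connection mid-sync leaves unsent records waiting for the next attempt.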

However, there were questions raised by the audience surrounding how PSI works to ensure ownership of data for the Ministries themselves. This proved much harder to answer than the logistical questions.

It was suggested that Ministries of Health may need to be persuaded to want to use data. Public sector workers in LMICs are already overburdened, so they may not want to engage in data visualisations, as this adds another layer to their already heavy workload. To combat this, and make health intervention data engaging and interesting, PSI have taken lessons from social media – #majorkey. In the dashboard, people can have conversations about the data, and they can also ‘tag’ others. This capacity building tactic has seen slow but certain progress in data engagement and interest.

PSI are working to push data out of their DHIS2 systems, to national health systems. This brings the country back into focus, rather than PSI itself. The contextual framework of the nation’s health system should always be taken into account. Action looks different in different settings, and evidence-based healthcare should reflect this.

Blog reports on the Centre for Evaluation’s Student Seminar, Delivering evidence-based health interventions. Population Services International’s (PSI) unique approach to development, on 1st February 2018.

Monitoring and evaluation: an insider’s guide to the skills you’ll need

November 2, 2016

In international development, everyone knows that good intentions are simply not enough. It is critical to agree on appropriate aims and then make sure that these can be achieved efficiently.

There are several different ways to achieve development goals. Take malaria, for example: approaches might include investing in vector control (reducing numbers of malaria-carrying mosquitoes); ensuring that people can access bednets; providing education on how to avoid contracting the disease; making chemoprophylaxis (prevention medication) more accessible; or treating malaria cases with better drugs, to name just a few.

We know that some ways of dealing with development challenges, such as malaria, will be more successful than others. Some approaches will have unintended consequences, they will vary in cost and will work in certain places but not in others. So how can those designing interventions decide which approaches to choose?

This is where evaluation studies come in: they aim to help development actors make the best choices. Evaluations can be used to improve programmes as they roll out and/or can try to estimate whether and how particular aims were achieved and whether this was better and more cost effective than other courses of action.

In order to design, run or interpret evaluations, budding development professionals need an understanding of the following.

1. Research study design, outcome measurement and statistical methods

Development programmes are often complex, but this does not mean that scientific methods such as experiments and careful analysis can’t aid a better understanding of whether programmes achieve their desired impact.

2. Social science methods

Development interventions depend on the complex interaction of multiple stakeholders and institutions. People’s goals and incentives differ, power is exercised and resisted in myriad ways, and choices are constrained by poverty or gender inequalities. Social science methods are required to make sense of these complexities to enable more effective implementation.

3. Cost-benefit analysis

When deciding how best to allocate limited resources, those designing interventions must be able to estimate the costs as well as the consequences of different programmes to ensure they get value for money. Cost-benefit analysis can also be used to compare programmes across different sectors, for instance, comparing health and education interventions.

4. Evidence-based decision making

Understanding what is already known is essential to avoid duplication. Synthesising evidence means pulling together all that has been said about a subject, making judgments about what bits of information are most useful, summarising this evidence, and planning new studies that focus on the most important contributions.

Teams of development professionals will need all these skills to varying degrees. For instance, evaluation experts need to be able to design and implement evaluation studies, while programme managers offer the best perspective on what interventions may be feasible and need to know how to commission and interpret evaluations.

But it is not only development workers who need evaluation skills. Evaluation is about accountability, identifying waste and avoiding harmful effects, and so these skills will also be essential to enable civil society, democratic representatives, and government officials to hold NGOs and other development actors to account.

Where can you learn these skills?

Over the past few years, evaluation courses have mushroomed in institutions all over the world, ranging from full degrees to short courses, face-to-face or via distance learning, at various levels of difficulty. Some examples are listed below.

Evaluation skills are also developed and championed within organisations through on-the-job and peer-to-peer learning. It is great to see growing commitment within international development organisations and donor agencies to developing key evaluation skills for their staff. After all, as management consultant Peter Drucker said: “What gets measured gets managed”, and development matters too much to not be properly managed.

Some examples of training courses in impact evaluation – the list is not exhaustive.

Short courses

Degree programmes

Conferences and seminars

Lively Discussions and Engaging Ideas at the LIDC and The Guardian Debate on Aid

November 8, 2016

On Thursday 27th October, the first debate of the Development Debate Series, organized by the London International Development Centre and The Guardian, took place. It discussed the theme of aid and asked: are we getting aid right?