Why polio eradication efforts should embrace data science and statistics

Real-time prediction model of cVDPV2 outbreaks could aid outbreak response vaccination strategies
Caption: Vaccination against polio in Pakistan. Credit: Sanofi Pasteur

It’s more than 10 years since I first presented a statistical model to predict which countries in Africa were most at risk of a polio outbreak. Since then, our research collaboration has evolved and adapted the tool to meet current needs. With wild polio eliminated from Africa, the major concern in this region is tackling outbreaks of circulating vaccine-derived poliovirus serotype 2 (cVDPV2), which can cause permanent paralysis.

Although better statistical methods won’t solve all the challenges around eradication, our new paper highlights their potential power and acts as a reminder of why it’s so important to have effective ways of monitoring and mapping real-time risk to target interventions.

In 2010 I was invited by the WHO to present my research at the Second Annual Regional Conference on Immunization in Ouagadougou, Burkina Faso. In my new post-doc position at Imperial College London I had developed a statistical model that could be used to predict which countries in Africa were most at risk of experiencing an outbreak of poliomyelitis. At the time, outbreaks were occurring in rapid succession and vaccination activities were not keeping up. Our idea was that statistical modelling could be used to support preventative actions and provide a more informed perspective on what was driving these hugely disruptive epidemics.

The aim of presenting this work was to illustrate that statistics and data could be used in this way, and to develop relationships with others working in the field. I can distinctly recall seeing the WHO field epidemiologists swooping into the hotel, jumping out of their 4x4 vehicles and settling down for a few days of the classic conference experience. It was here that I first met several epidemiologists working in polio, including Dr Sam Okiror (who at the time was one of the outbreak epidemiologists for polio), who voiced real interest in using models to inform outbreak response. I came away feeling enthused and with a vision that real-time mapping of outbreak risk would be instrumental in delivering polio eradication.

Roll forward over 10 years, and I’m really pleased to see how this vision has evolved. As part of a special edition of Vaccine, we report on a collaborative effort in which a very detailed model of poliomyelitis risk is used to provide information on which districts in Africa are most at risk of cVDPV2. Over the past two years this model has been developed in a collaboration between LSHTM, the Institute for Disease Modeling (IDM), the Bill & Melinda Gates Foundation (BMGF) and the WHO.

The work, funded by the BMGF, involves generating a risk map for the AFRO and EMRO regions from virus surveillance data, with the risk updated on a daily basis. Field epidemiologists then use the risk maps to help design the scope of outbreak response, drawing on their wide experience and knowledge. In the intervening decade, the modelling approach has been used at a national level to predict polio outbreaks in Africa and prioritise preventive campaigns (wild poliovirus was officially eliminated in the African region last year). The process has benefited from several methodological developments:

  • Surveillance for poliovirus has improved considerably across Africa: more suspected cases are reported to the system and environmental surveillance is now routinely used in many countries, meaning that our ability to detect polio is better than ever.
  • The data processing is now automated (no more Excel spreadsheets and spelling errors!); this was achieved through a large investment in computation and skills development in data science.
  • The availability of real-time data has enabled improvements in the modelling; there are now sufficient data to estimate population immunity within a district for a given month, and many more important data (such as virus data, estimates of routine immunisation and reliable population numbers) are now included in the model to estimate risk.
  • The model is sufficiently flexible to enable forecasts from 1 month to 6 months ahead, which helps to align with the timing of campaigns.

On a personal level, I’ve also been fortunate to work with many colleagues, including at IDM, BMGF, WHO, CDC and Imperial College London. I was awarded a Medical Research Council fellowship to develop my skills in statistical methods, and applied many of these new skills to problems in polio, moving to LSHTM to develop my career. Working on one pathogen for over 10 years, complex as it is, has given me experience of how scientific research intersects with a real-world public health challenge, and of where individual responses and actions can facilitate change, largely (and fortunately) for good.

Better statistical methods won’t solve all of the problems of polio eradication, but I’m hopeful for the future. Poor surveillance remains an issue, especially in conflict areas, but routine use of genetic sequencing has helped epidemiologists understand the extent of missing cases. cVDPV2s are a big issue, and hopes are pinned on the novel OPV to halt cVDPV2 transmission and minimise new emergences. Low vaccination coverage continues to beset polio eradication, but work continues to find innovative ways to improve it. Political commitment and community engagement have been difficult at times, including the recent reduced commitment by the UK government to the global goal of polio eradication. However, there have been worse times in polio eradication, and I’m confident that as a collaborative global partnership we can overcome the challenges with renewed enthusiasm in the coming years. For example, the Global Polio Eradication Initiative launched its Strategic Plan for 2022 and beyond with increased vigour, especially for integration with other health services, which I think will be essential.

As far as statistical methods go, there are still ideas that I’d like to develop, but essential to this will be encouraging others in polio eradication to embrace data science and statistics. This has already begun to happen over the last 10 years, as shown by many of the articles also published in the Vaccine special edition. Polio may not have been eradicated yet, but the journey has been a really valuable one. I urge all those with interests in polio eradication to keep up the momentum.


Arend Voorman, Kathleen O'Reilly, Hil Lyons, Ajay Kumar Goel, Kebba Touray, Samuel Okiror. Real-time prediction model of cVDPV2 outbreaks to aid outbreak response vaccination strategies. Vaccine. DOI: 10.1016/j.vaccine.2021.08.064



