I am primarily interested in causal inference methodology, dealing with confounding and missing data, in both clinical trials and observational studies. I am particularly interested in how machine learning can be used to obtain valid causal inferences. I currently hold a Wellcome LSHTM Fellowship (focusing on valid causal inference methods after using machine learning for variable selection). I have previously held an MRC Career Development Award in Biostatistics and an NIHR Methods Fellowship in Medical Statistics.
I have a PhD in Mathematics from Imperial College London and an MSc in Medical Statistics from the London School of Hygiene and Tropical Medicine.
In 2017, I was a visiting scholar in Prof. Mark van der Laan's Computational Biology and Causality group at the School of Public Health, University of California, Berkeley.
I currently have a third-year PhD student (working on methods for adjusting for nonadherence in cluster randomised trials). Previous PhD students worked on missing data methodology for cluster randomised trials.
I am the co-organiser of the School's short courses "Statistical Analysis with Missing Data using Multiple Imputation and Inverse Probability Weighting" and "Causal Inference in Epidemiology: Recent Methodological Developments".
I am also the organiser of the "Intro to Bayesian Statistics" and "Causal Inference" sub-modules in the MSc in Medical Statistics.
My current work involves doubly robust estimators, in particular those that can be paired with ensemble machine learning methods (e.g. Super Learner). Examples of these are targeted minimum loss-based estimators (TMLE). These methods are very promising for studying causal effects using big data. This work is in collaboration with Prof Stijn Vansteelandt and Rhian Daniel (Cardiff).
I am also working on developing methods for estimating causal treatment effects when there are departures from protocol in a randomised trial (i.e. non-compliance and missing data) using Multiple Imputation. This work is in collaboration with Prof James Carpenter. I have also worked on methods to analyse clustered data with missing values. Currently, I study and compare the performance of different multiple imputation techniques for handling whole-cluster non-response (empty clusters). I am also interested in extending such methods to cost-effectiveness analysis, accounting for the bivariate nature of the endpoints. Some of the code from my methodological work can be found on my GitHub page.
I am a member of the Centre for Statistical Methodology, and one of the co-ordinators of the missing data and causal inference themes.