Dr Karla Diaz-Ordaz


Associate Professor
of Biostatistics


Keppel Street
United Kingdom

020 7927 2065

My primary methodological research area is causal machine learning motivated by high-dimensional electronic health records and genomics data.

My work on treatment effect heterogeneity and optimal treatment regimes is funded through a Wellcome Trust-Royal Society Sir Henry Dale Fellowship (2020-2025).

I am also co-lead of a collaborative research project, "Developing statistical machine learning methods for Clinical Trials", based at the Alan Turing Institute.

Previously, I held a Wellcome Trust-LSHTM Institutional Support Fellowship (2018-2019), an MRC Career Development Award in Biostatistics (2014-2018) and an NIHR Methods Fellowship (2009-2012).

In 2017, I was a visiting scholar to Prof. Mark van der Laan's Computational Biology and Causality group at the School of Public Health, University of California, Berkeley.


I have a PhD in Mathematics from Imperial College London and an MSc in Medical Statistics from the London School of Hygiene and Tropical Medicine.

PhD students:

Oliver Hines (2018- ): MRC London Intercollegiate Doctoral scholarship studying doubly-robust methods with machine learning in high-dimensional data, with applications to cardio-genetics mediation (joint with Prof Stijn Vansteelandt).

Former PhD students:

Dr Schadrac Agbla (awarded 2019): Instrumental variable methods for adjusting for non-adherence in cluster randomised trials (joint with Prof Bianca De Stavola).

Dr Anower Hossain (awarded 2017): Missing data methodology for cluster randomised trials (joint with Dr Jonathan Bartlett).



Department of Medical Statistics
Faculty of Epidemiology and Population Health


Centre for Statistical Methodology


I am the co-organiser of the School's short course Causal Inference in Epidemiology: Recent Methodological Developments.

I also teach on the Advanced Statistical Methods (Causal Inference) sub-module in the MSc in Medical Statistics.

I am also a co-organiser of the Machine Learning Module in the MSc in Health Data Science.


My current work involves doubly-robust estimators in which the nuisance parameters are estimated with machine learning (e.g. Super Learner). Examples include Targeted Minimum Loss estimators (TMLE) and g-estimators with machine learning. These methods show promise for studying causal effects using big data. This work is in collaboration with Prof Stijn Vansteelandt (U Ghent).

I am also a co-Principal Investigator (together with Prof Chris Holmes) on a project at the Alan Turing Institute scoping the uses of machine learning in clinical trials.

Previous work involved developing methods for estimating causal treatment effects when there are departures from protocol in a randomised trial (i.e. non-compliance and missing data) using Multiple Imputation (in collaboration with Prof James Carpenter). I have also worked on extending methods for cost-effectiveness analysis, accounting for the bivariate nature of the endpoints (with Prof Richard Grieve).

Code for some of my methodological work can be found on my GitHub page.

I am a member of the Centre for Statistical Methodology, and one of the co-ordinators of the missing data and causal inference themes.

Research Area
Clinical trials
Complex interventions
Economic evaluation
Statistical methods
Bayesian Analysis
Electronic health records
Health economics

Selected Publications

Living risk prediction algorithm (QCOVID) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study.
Clift AK; Coupland CAC; Keogh RH; Diaz-Ordaz K; Williamson E; Harrison EM; Hayward A; Hemingway H; Horby P; Mehta N
BMJ (Clinical research ed.)
Causal graphs for the analysis of genetic cohort data.
Hines O; Diaz-Ordaz K; Vansteelandt S; Jamshidi Y
Informative presence and observation in routine health data: A review of methodology for clinical risk prediction.
Sisk R; Lin L; Sperrin M; Barrett JK; Tom B; Diaz-Ordaz K; Peek N; Martin GP
Journal of the American Medical Informatics Association
Estimating cluster-level local average treatment effects in cluster randomised trials with non-adherence.
Agbla SC; De Stavola B; DiazOrdaz K
Statistical methods in medical research
Local average treatment effects estimation via substantive model compatible multiple imputation.
DiazOrdaz K; Carpenter J
Biometrical journal. Biometrische Zeitschrift
Using Animation to Self-Report Health: A Randomized Experiment with Children.
Guerriero C; Jaume NA; Diaz-Ordaz K; Brown KL; Wray J; Ashworth J; Abbiss M; Cairns J
The patient
Domains of transmission and association of community, school, and household sanitation with soil-transmitted helminth infections among children in coastal Kenya.
Oswald WE; Halliday KE; Mcharo C; Witek-McManus S; Kepha S; Gichuki PM; Cano J; Diaz-Ordaz K; Allen E; Mwandawiro CS
PLOS Neglected Tropical Diseases
Covariate adjustment in individually randomised trials
Williamson E; Leyrat C; Diaz-Ordaz K