Seminar slides and recordings

Slides and recordings from CSM events going back two years are available below. If you experience any issues with the downloads/links on this page please e-mail: csm@lshtm.ac.uk.


Centre for Statistical Methodology Lecture: “Targeted learning: The bridge from machine learning to statistical and causal inference”

4 March 2020

Speaker: Prof Mark van der Laan (University of California Berkeley)

Slides

Abstract: Society is drowning in data and the current practice of learning from data is to apply traditional statistical methods that are too simplistic, arbitrarily chosen, and subject to manipulation. Nonetheless, these methods inform policy and science, affecting our sense of reality and judgements. This talk exposes this deceptive practice, and presents a solution — a principled and reproducible approach, termed targeted learning, for generating actionable and truthful information from complex, real-world data. This approach unifies causal inference, machine learning and deep statistical theory to answer causal questions with statistical confidence.

This is a public lecture, intended for academics from several disciplines and those interested in the role of causal inference in machine learning. The audience will hear about the historical developments that led to the recent "marriage" of causality and machine learning, and then specifically about targeted learning.


Centre for Statistical Methodology Seminar

Title: Satellite-based machine learning models to estimate high-resolution environmental exposures across the UK.

Rochelle Schneider and Antonio Gasparrini (LSHTM)

Slides

Abstract: 

Air pollution is a public health concern, especially fine particulate matter (PM2.5). Both long- and short-term PM2.5 exposures are associated with adverse health outcomes (such as increased mortality and morbidity). Epidemiological assessments often rely on measurements from monitoring networks, which, however, are geographically sparse and mostly located in major cities. Novel big data resources, such as aerosol optical depth (AOD) measurements from satellite instruments, offer wide spatio-temporal coverage and can address limitations of traditional exposure methods.

In this talk, we present satellite-based machine learning models to reconstruct levels of PM2.5 at high spatial and temporal resolution in Great Britain within the period 2003-2018. The model combines earth observation satellite measurements with multiple resources, including station data, climate and atmospheric models, traffic data, land-cover, and other geospatial features. The model then relies on a multi-stage random forest algorithm to predict PM2.5 concentrations at various temporal (daily to yearly) and spatial (1km to 100m) resolutions. Such exposure data can be linked to small-area or individual-level health databases to perform country-wide epidemiological analyses on the health risks associated with air pollution.


Centre for Statistical Methodology Workshop: “Methods in Integrative Genomics” 

Speakers:

  • Manuela Zucknick (University of Oslo): Multivariate structured Bayesian variable selection for treatment prediction in pharmacogenomic screens Slides (.pdf, 3.4 MB)
  • Ernest Diez Benavente (LSHTM): A molecular barcode to inform the geographical origin and transmission dynamics of Plasmodium vivax malaria Slides (.pdf, 1.3 MB)
  • Ricard Argelaguet (European Bioinformatics Institute): MOFA: a principled framework for the unsupervised integration of multi-omics data Slides (.pdf, 1.0 MB)
  • Paul Kirk (MRC Biostatistics Unit, Cambridge): Integrative clustering approaches for multi-omics datasets Slides (.pdf, 8 MB)

Centre for Statistical Methodology Seminar

Title: Statistical methods for cost-effectiveness analysis: a personal history.

Andy Briggs (LSHTM)

Health Economics Theme

Slides

Abstract: The last 25 years have seen a large increase in the contribution that health economic analysis has made in national and international decisions about health care provision. Andy Briggs has been working at the interface between medical statistics and health economics throughout this period.  In this talk he gives a personal history of that journey with an emphasis on how statistical thinking has improved the methods of health economic evaluation over that period.  Looking to the future, there remains much potential for statistical methods to continue to improve the way in which we evaluate the cost-effectiveness of health care interventions and to improve health care decision making as a result.


Centre for Statistical Methodology Seminar

Title: Adjusting for selection bias due to missing data in electronic health records-based research.

Tanayott Thaweethai (Harvard School of Public Health)

Slides and audio

Abstract: The widespread adoption of electronic health records (EHR) over the last decade has resulted in an explosion of data available to researchers, which has transformed the landscape of observational research. Since EHR are not collected for research purposes, observational studies using EHR are particularly susceptible to issues of missing data. I present a scalable method that considers a modularization of the data provenance, which entails breaking down the path to observing ‘complete’ data in the EHR into a sequence of decisions or events. Following modularization, the analyst has the flexibility to model each ‘step’ along the sequence individually using inverse probability weighting (IPW) or multiple imputation (MI). In some settings, this approach can even handle data suspected to be missing not at random. I establish the asymptotic properties of an estimator that combines IPW with MI, finding that Rubin’s standard combining rules can be substantially biased under certain conditions. I applied this approach to two settings: first, to address missing baseline and follow-up BMI in a study of bariatric surgery among patients with renal impairment, and second, to address missing eligibility criteria in a single-arm clinical trial where a synthetic control arm is built from patient EHR data.


Centre for Statistical Methodology Seminar

Title: Assessing causal effects in the presence of treatment switching through principal stratification.

Fabrizia Mealli (University of Florence)

Causal inference Theme

Slides and audio

Abstract: Consider clinical trials focusing on survival outcomes for patients suffering from Acquired Immune Deficiency Syndrome (AIDS)-related illnesses or particularly painful cancers in advanced stages. These trials often allow patients in the control arm to switch to the treatment arm if their physical conditions are worse than certain tolerance levels. The Intention-To-Treat analysis compares groups formed by randomization regardless of the treatment actually received. Although it provides valid causal estimates of the effect of assignment, it does not measure the effect of the actual receipt of the treatment and ignores the information of treatment switching in the control group. Other existing methods propose to reconstruct the outcome a unit would have had if s/he had not switched. But these methods usually rely on strong assumptions, for example, there exists no relation between patient’s prognosis and switching behavior, or the treatment effect is constant. Clearly, the switching status of the units in the control group contains important post-treatment information, which is useful to characterize the treatment effect heterogeneity. We propose to re-define the problem of treatment switching using principal stratification and introduce new causal estimands, principal causal effects for patients belonging to subpopulations defined by the switching behavior under control. For statistical inference, we use a Bayesian approach to take into account that (i) switching happens in continuous time generating infinitely many principal strata; (ii) switching time is not defined for units who never switch in a particular experiment; and (iii) survival time and switching time are subject to censoring. 
We illustrate our framework using a synthetic dataset based on the Concorde study, a randomized controlled trial that aimed to assess causal effects on time-to-disease progression or death of immediate versus deferred treatment with zidovudine among patients with asymptomatic HIV infection. Joint work with Alessandra Mattei and Peng Ding.


Centre for Statistical Methodology Seminar

Title: Selecting causal risk factors from high-throughput experiments using multivariable Mendelian randomization.

Verena Zuber (Imperial College London)

Bayesian Theme Seminar

Slides and audio available soon

Abstract: Modern high-throughput experiments provide a rich resource to investigate causal determinants of disease risk. Mendelian randomization (MR) is the use of genetic variants as instrumental variables to infer the causal effect of a specific risk factor on an outcome. Multivariable MR is an extension of the standard MR framework to consider multiple potential risk factors in a single model. However, current implementations of multivariable MR use standard linear regression and hence perform poorly with many risk factors.

Here, we propose a novel approach to multivariable MR based on Bayesian model averaging (MR-BMA) that scales to high-throughput experiments and can select biomarkers as causal risk factors for disease. In a realistic simulation study we show that MR-BMA can detect true causal risk factors even when the candidate risk factors are highly correlated. We illustrate MR-BMA by analysing publicly-available summarized data on metabolites to prioritise likely causal biomarkers for cardiovascular disease.


Centre for Statistical Methodology Symposium: “Quantitative approaches to personalised medicine”

12 November 2019

Speakers:

  • Dr John Whittaker (GlaxoSmithKline Pharmaceuticals): The pharmaceutical industry and personalisation: what have we learnt, and what’s required in future? Slides (.pdf, 0.1MB)
  • Professor Mihaela van der Schaar (University of Cambridge): Transforming medicine through Artificial Intelligence-enabled healthcare pathways Slides (.pdf, 1.5MB)
  • Dr Brian Tom (MRC Biostatistics Unit, Cambridge): Personalising inter-donation intervals amongst blood donors Slides (.pdf, 0.4MB)
  • Dr Karla Diaz-Ordaz (LSHTM): Using data-adaptive methods to investigate conditional treatment effects: towards personalised treatment regimes Slides (.pdf, 0.9MB)
  • Professor Andrew Briggs (LSHTM): The economics of personalised medicine: threat or opportunity? Slides (.pdf, 0.4MB)
  • Dr Stephen Senn (Independent Statistical Consultant, Edinburgh): A statistical sceptic’s view of personalised medicine Slides (.pdf, 0.9MB)

Centre for Statistical Methodology Seminar

Title: Dealing with missing binary outcomes in cluster randomized trials: weighting vs. imputation methods

Elizabeth L. Turner (Duke University)

Analysis of Clinical Trials Theme

Slides and audio

Abstract: Cluster randomized trials (CRTs) are commonly used to evaluate the impact of public health interventions on a range of outcomes and in a range of global health settings. Yet, most CRTs have some missing outcome data and analysis of available data may be biased when outcome data are not missing completely at random. In this talk, we will focus on analysis of CRTs with binary outcomes using the generalized estimating equations (GEE) approach.

In this context, multilevel multiple imputation for GEE (MMI-GEE) has been widely used and methodological work has been undertaken to evaluate its properties (e.g. see work by LSHTM researchers including Hossain, Diaz-Ordaz and Bartlett). Performance of this method has been shown to be very good but there are some challenges to implementing this procedure in standard software. Alternative approaches such as inverse probability weighted GEE (W-GEE) are less common but may be easier to implement in practice. Therefore, we have evaluated properties of W-GEE methods and compared the results with MMI-GEE for binary outcomes using both simulations and a real data example from a CRT to evaluate the effect of a teacher-training intervention on child literacy outcomes in Kenya. This is joint work with Lanqiu Yao, Fan Li and Melanie Prague.


Centre for Statistical Methodology Seminar

Title: Causal inference and competing events

Jessica Young (Harvard Medical School)

Causal Inference Theme

Slides and audio

Abstract: In failure-time settings, a competing risk event is any event that makes it impossible for the event of interest to occur. For example, cardiovascular disease death is a competing event for prostate cancer death because an individual cannot die of prostate cancer once he has died of cardiovascular disease. Various statistical estimands have been defined as possible targets of inference in the classical competing risks literature.  These include the so-called cause-specific hazard, subdistribution hazard, marginal hazard, cause-specific cumulative incidence and marginal cumulative incidence. Many reviews have described these statistical estimands and their estimating procedures with recommendations about their reporting when the goal is causal effect estimation. However, this previous work has not used a formal framework for characterizing causal effects and their identifying conditions which makes it difficult to evaluate these recommendations, even in a randomized trial with no loss to follow-up. Here we will place these estimands within a counterfactual framework for causal inference in order to:

  1. define counterfactual contrasts in each of these estimands under different treatment interventions
  2. interpret each contrast under data generating assumptions represented by a causal DAG
  3. understand identification of each of these contrasts in data with censoring events, including how identification can be evaluated with causal DAGs, and
  4. understand how the combined choice of estimand and identifying assumptions leads to a choice of estimating procedure

Centre for Statistical Methodology Seminar

Title: Beyond the average: Contrasting targeted learning and causal forests for inference about conditional average treatment effects of social health insurance programmes

Noemi Kreif (University of York)

Big Data and Machine Learning Theme

Slides and audio

Abstract: Researchers evaluating social policies are often interested in identifying individuals who would benefit most from a particular policy. Recently proposed causal inference approaches that incorporate machine learning (ML) have the potential to help explore treatment effect heterogeneity in a flexible yet principled way. We contrast two such approaches in a study evaluating the effects of enrollment in social health insurance schemes on health care utilisation of Indonesian mothers. First, we apply a double-machine learning approach, targeted minimum loss-based estimation (TMLE), where we estimate both the outcome regression and the propensity score flexibly using an ensemble ML approach. From the individual-level predictions of potential outcomes we calculate individual-level treatment effects and use a Random Forest (RF) procedure to identify the variables that predict these effects. We contrast this exploratory approach with an application of the Causal Forests method (Wager and Athey, 2018 JASA), which has been designed to directly estimate heterogeneous treatment effects by modifying the standard RF algorithm to maximise the variance of the predicted treatment effects. In both analyses we find that the most important effect modifiers include educational status, age and household wealth. When reporting conditional average treatment effects (CATEs) for these subgroups, the methods agree that less well-educated and younger mothers would benefit more from health insurance than well-educated and older ones. The CATEs reported by the Causal Forests have wider confidence intervals than those reported by the TMLE approach, potentially due to the extra sample splitting step employed.


Centre for Statistical Methodology Seminar

Title: Post-“Modern Epidemiology”: when methods meet matter

George Davey Smith (University of Bristol)

Causal Inference Theme

Slides (pdf)

Abstract: In the last third of the 20th century, etiological epidemiology within academia in high-income countries shifted its primary concern from attempting to tackle the apparent epidemic of non-communicable diseases to an increasing focus on developing statistical and causal inference methodologies. This move was mutually constitutive with the failure of applied epidemiology to make major progress, with many of the advances in understanding the causes of non-communicable diseases coming from outside the discipline, while ironically revealing the infectious origins of several major conditions. Conversely, there were many examples of epidemiologic studies promoting ineffective interventions and little evident attempt to account for such failure. Major advances in concrete understanding of disease etiology have been driven by a willingness to learn about and incorporate into epidemiology developments in biology and cognate data science disciplines. If fundamental epidemiologic principles regarding the rooting of disease risk within populations are retained, recent methodological developments combined with increased biological understanding and data sciences capability should herald a fruitful post–modern.


Centre for Statistical Methodology Seminar

Causal Inference Theme

A new approach to generalizability of clinical trials

Anders Huitfeldt (LSE)


Centre for Statistical Methodology Seminar

Causal Inference Theme

Using Quantitative Bias Analysis to Deal with Misclassification in the Results Section, not the Discussion Section.

Matt Fox (Boston University)


Centre for Statistical Methodology Seminar

Statistical Computing Theme

An extended mixed-effects model for meta-analysis: statistical framework and the R package mixmeta.

Antonio Gasparrini and Francesco Sera (LSHTM)


Centre for Statistical Methodology Seminar

Big Data Theme

Large numbers of explanatory variables.

Heather Battey (Imperial College London)

Slides available soon


Centre for Statistical Methodology Seminar

Friday 14 December 2018

Design and analysis of trials where the outcome is a rate of change, with an introduction to a new Stata package for sample size calculation

Chris Frost and Amy Mullick (LSHTM)

Slides and audio


Centre for Statistical Methodology Seminar

Friday 30 November 2018

Uncertainty and missing data in dietary intake and activity data.

Graham Horgan (Rowett Institute, University of Aberdeen)

Slides and audio (external website)


Centre for Statistical Methodology Seminar

Clinical Trials Theme

Friday 23 November 2018

Lessons learned from implementing a stratified medicine master protocol: The National Lung Matrix Trial

Prof Lucinda Billingham (University of Birmingham)

Slides available soon


Centre for Statistical Methodology Seminar

Clinical Trials Theme

Friday 2 November 2018

Response-Adaptive Randomisation: Implementing Optimality Criteria in Clinical Trials

Sofia Villar (MRC Biostatistics Unit, Cambridge)

Slides (pdf)


Centre for Statistical Methodology Seminar

Friday 26 October 2018

Framework and practical tool for eliciting expert priors in clinical trials with MNAR outcomes

Alexina Mason (LSHTM)

Slides (pdf)


Centre for Statistical Methodology Seminar

Friday 28 September 2018

Assessing comparative effectiveness of cancer treatments in the SEER-Medicare linked database: a causal approach

Lucia Petito (Harvard School of Public Health)

Slides and audio (external website)


Centre for Statistical Methodology Seminar

Missing Data & Measurement Error Theme

6 July 2018

Generating multiple imputation from multiple models to reflect missing data mechanism uncertainty: Application to a longitudinal clinical trial

Prof Ofer Harel (University of Connecticut)

Slides and audio (external website)


Centre for Statistical Methodology Seminar

2 July 2018

Bayesian treatment comparison using parametric mixture priors computed from elicited histograms

Moreno Ursino (Cordeliers Research Centre, Paris)

Slides and audio (external website)


Centre for Statistical Methodology Seminar

Health Economics Theme

29 June 2018

Experiences of structured elicitation cost-effectiveness analyses

Marta Soares (University of York)

Slides and audio (external website)


Centre for Statistical Methodology Seminar

Time Series Regression Analysis Theme

18 May 2018

Case time series: a flexible design for big data epidemiological analyses

Antonio Gasparrini (LSHTM)

Slides and audio (external website)


Centre for Statistical Methodology Seminar

Causal Inference Theme

4 May 2018

How to obtain valid tests and confidence intervals for treatment effects after confounder selection?

Prof Stijn Vansteelandt (University of Ghent & LSHTM)

Slides and audio (external website)


Centre for Statistical Methodology Seminar

Survival Analysis Theme

27 April 2018

Dynamic prediction in fertility

Nan van Geloven (University of Leiden)

Slides and audio (external website)


Early Career Researcher Showcase

26 March 2018

  • Ruth Farmer (LSHTM): Dealing with time dependent confounding in diabetes pharmacoepidemiology: an application of marginal structural models to electronic health care records
  • Baptiste Leurent (LSHTM): Sensitivity analysis for informative missing data in cost-effectiveness analysis
  • Christen Gray (LSHTM): Use of the Bayesian family of methods for correcting exposure measurement error in polynomial regression models
  • Jennifer Thompson (LSHTM): Advice for using generalised estimating equations in a stepped-wedge trial
  • Benedetta Pongiglione (UCL): Disability and all-cause mortality in the older population: evidence from the English Longitudinal Study of Ageing
  • Andrea Gabrio (UCL): Statistical issues in small/pilot cost-effectiveness analyses from individual level data
  • Gillian Stresman (LSHTM): Spatial analysis to understand malaria transmission and the potential for spatially targeted interventions
  • Prof Vern Farewell (MRC Biostatistics Unit): Use of a multi-state model with a composite arthritis outcome

Slides and audio (external website)


Centre for Statistical Methodology Seminar

Big Data Theme

26 January 2018

Statistical methods for real-time monitoring of health outcomes

Prof Peter Diggle (University of Lancaster)

Slides (.pdf, 8.2MB)

Slides and audio (external website)