Centre themes

Analysis of Clinical Trials

Theme Co-ordinators: Neal Alexander, Baptiste Leurent, Clemence Leyrat, Amy Mulick, Stephen Nash, Linda Sharples

Please see here for slides and audio recordings of previous seminars relating to this theme.

Randomised controlled trials (RCTs) are one of the most important tools to estimate effects of medical interventions. However, there are a number of issues relating to the statistical analysis of RCTs over which there is much debate. We aim to consider and raise discussion of a number of these issues, and suggest possible statistical approaches for dealing with these in the analyses of RCTs. We briefly introduce here some of these issues.


  • Cluster randomised trials
  • Covariate adjustment
  • Subgroup analysis
  • Missing data
  • Non-compliance
  • Sequential trials
  • Good practice in trials

Cluster randomised trials

In cluster randomised trials (CRTs), groups of participants, rather than the participants themselves, are randomised to intervention groups1. This design is increasingly used to assess complex interventions, in particular community-level interventions. Many CRTs have been conducted at LSHTM to evaluate the effectiveness of health-care interventions, motivating a wide range of methodological research on their design2,3, analysis4,5 and reporting6 to take into account the specific features of such trials. A key challenge in CRTs is that participants’ outcome measures are not independent, so clustering must be accounted for in the analysis. For this reason, CRTs are also a main research area within the Design and Analysis for Dependent Data CSM theme.

Other issues arising from CRTs have been recently studied at the CSM, such as the risk of systematic baseline imbalance7, analysis in the presence of missing data8, and analysis of alternative cluster designs such as cluster cross-over trials9. Ongoing projects include research into spatial analysis of CRTs, the analysis strategy when only a small number of clusters are randomised, and the estimation of causal effects in CRTs with non-compliance.
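As a rough illustration of why clustering cannot be ignored, the sketch below computes the standard design effect, 1 + (m − 1) × ICC, for equal cluster sizes m and a given intracluster correlation coefficient (ICC). The function names and all numbers are hypothetical and are not taken from any of the trials cited here.

```python
import math

def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return 1.0 + (cluster_size - 1.0) * icc

def clusters_per_arm(n_individual: int, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm, starting from the per-arm sample size of an
    individually randomised trial with the same power."""
    inflated = n_individual * design_effect(cluster_size, icc)
    return math.ceil(inflated / cluster_size)

# Hypothetical inputs: 200 participants per arm under individual
# randomisation, clusters of 20 people, ICC of 0.05.
deff = design_effect(20, 0.05)
k = clusters_per_arm(200, 20, 0.05)
print(f"design effect = {deff:.2f}, clusters per arm = {k}")
```

With these assumed inputs the required sample size almost doubles, even though the ICC is small. This simple formula assumes equal cluster sizes; unequal sizes or very few clusters need further adjustment.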

  1. Hayes RJ, Moulton LH. Cluster Randomised Trials. Taylor & Francis; 2009. 338 p.
  2. Hayes RJ, Alexander ND, Bennett S, Cousens SN. Design and analysis issues in cluster-randomized trials of interventions against infectious diseases. Stat Methods Med Res. 2000 Apr;9(2):95–116.
  3. Thomson A, Hayes R, Cousens S. Measures of between-cluster variability in cluster randomized trials with binary outcomes. Stat Med. 2009 May 30;28(12):1739–51.
  4. Gomes M, Díaz-Ordaz K, Grieve R, Kenward MG. Multiple imputation methods for handling missing data in cost-effectiveness analyses that use data from hierarchical studies: an application to cluster randomized trials. Med Decis Mak Int J Soc Med Decis Mak. 2013 Nov;33(8):1051–63.
  5. Alexander N, Emerson P. Analysis of incidence rates in cluster-randomized trials of interventions against recurrent infections, with an application to trachoma. Stat Med. 2005 Sep 15;24(17):2637–47.
  6. Campbell MK, Piaggio G, Elbourne DR, Altman DG, for the CONSORT Group. Consort 2010 statement: extension to cluster randomised trials. BMJ. 2012 Sep 4;345(sep04 1):e5661–e5661.
  7. Leyrat C, Caille A, Foucher Y, Giraudeau B. Propensity score to detect baseline imbalance in cluster randomized trials: the role of the c-statistic. BMC Med Res Methodol. 2015;
  8. DiazOrdaz K, Kenward MG, Gomes M, Grieve R. Multiple imputation methods for bivariate outcomes in cluster randomised trials. Stat Med. 2016 Sep 10;35(20):3482–96.
  9. Morgan KE, Forbes AB, Keogh RH, Jairath V, Kahan BC. Choosing appropriate analysis methods for cluster randomised cross-over trials with a binary outcome. Stat Med. 2016 Sep 28;

Covariate adjustment

In RCTs, unadjusted analysis provides an unbiased estimate of the treatment effect. However, even though randomisation should ensure baseline characteristics (covariates) are broadly balanced between groups, chance imbalances can occur, especially in smaller trials. Covariate adjustment for important predictors of outcome can be used to allow for such imbalances1. In certain circumstances, adjustment using appropriate covariates can also be used to improve the power (or to reduce the required sample size) of an RCT irrespective of any baseline imbalance2.
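The gain in precision from adjustment can be seen in a small simulation. The sketch below is illustrative only: it assumes a continuous outcome, a single strongly prognostic baseline covariate and a common slope in both arms, and every parameter value is hypothetical. It compares the unadjusted difference in means with the ANCOVA estimate over repeated simulated trials.

```python
import random
import statistics

def mean(xs):
    return sum(xs) / len(xs)

def simulate_trial(rng, n_per_arm=50, effect=1.0, slope=2.0):
    """Simulate one two-arm trial; return (unadjusted, adjusted) estimates."""
    arms = []
    for arm in (0, 1):
        x = [rng.gauss(0, 1) for _ in range(n_per_arm)]  # baseline covariate
        y = [effect * arm + slope * xi + rng.gauss(0, 1) for xi in x]
        arms.append((x, y))
    (x0, y0), (x1, y1) = arms
    unadjusted = mean(y1) - mean(y0)
    # Pooled within-arm slope of outcome on covariate (the ANCOVA slope).
    sxy = sxx = 0.0
    for x, y in arms:
        mx, my = mean(x), mean(y)
        sxy += sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        sxx += sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    # Correct the raw difference for the chance imbalance in the covariate.
    adjusted = unadjusted - b * (mean(x1) - mean(x0))
    return unadjusted, adjusted

rng = random.Random(2024)
results = [simulate_trial(rng) for _ in range(2000)]
unadjusted_estimates = [u for u, _ in results]
adjusted_estimates = [a for _, a in results]
sd_unadjusted = statistics.stdev(unadjusted_estimates)
sd_adjusted = statistics.stdev(adjusted_estimates)
print(f"SD of unadjusted estimates: {sd_unadjusted:.3f}")
print(f"SD of adjusted estimates:   {sd_adjusted:.3f}")
```

Both estimators are centred on the true effect, but under these assumptions the adjusted estimates vary much less from trial to trial. The size of the gain depends on how strongly the covariate predicts the outcome.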

Within the CSM, work has been conducted to look at the impact of covariate adjustment on power in real settings3, and also to investigate alternative strategies to multivariable regression for covariate adjustment, in particular the use of propensity score weighting4. Furthermore, methodological research has been conducted to extend covariate adjustment methods to more challenging randomised trials such as cross-over5 or cluster randomised trials6, in which both chance and systematic imbalance can occur. Recent work also includes the development of recommendations for the implementation of covariate adjustment in practice7.

  1. Altman DG. Adjustment for Covariate Imbalance. In: Biostatistics in Clinical Trials [Internet]. Chichester, UK: John Wiley & Sons, Ltd; 2001 [cited 2016 Oct 14]. p. 122–7.
  2. Hernández AV, Steyerberg EW, Habbema JDF. Covariate adjustment in randomized controlled trials with dichotomous outcomes increases statistical power and reduces sample size requirements. J Clin Epidemiol. 2004 May;57(5):454–60.
  3. Turner EL, Perel P, Clayton T, Edwards P, Hernández AV, Roberts I, et al. Covariate adjustment increased power in randomized controlled trials: an example in traumatic brain injury. J Clin Epidemiol. 2012 May;65(5):474–81.
  4. Williamson EJ, Forbes A, White IR. Variance reduction in randomised trials by inverse probability weighting using the propensity score. Stat Med. 2014 Feb 28;33(5):721–37.
  5. Kenward MG, Roger JH. The use of baseline covariates in crossover studies. Biostatistics. 2010 Jan 1;11(1):1–17.
  6. Gomes M, Grieve R, Nixon R, Ng ES-W, Carpenter J, Thompson SG. Methods for Covariate Adjustment in Cost-Effectiveness Analysis That Use Cluster Randomised Trials. Health Econ. 2012;21(9):1101–1118.
  7. Pocock SJ, McMurray JJV, Collier TJ. Statistical Controversies in Reporting of Clinical Trials: Part 2 of a 4-Part Series on Statistics for Clinical Trials. J Am Coll Cardiol. 2015 Dec 15;66(23):2648–62.

Subgroup analysis

The analysis of RCTs by subgroups of individuals (e.g. according to age, gender or medical history) remains controversial and often misunderstood1–3. While a well conducted subgroup analysis can be justified, deviation from recommended practices can easily result in dubious findings3. It is recognised2,4 that such analyses should be limited to a few key baseline factors which are specified prior to any analyses being undertaken. In addition, an appropriate analysis should report effect estimates and confidence intervals within such subgroups, together with an overall interaction test, rather than separate p-values within each category of the subgroup. Irrespective of the findings, it is recommended to interpret them with caution until replicated. Sun et al. derived a useful list of criteria to assess the credibility of subgroup analysis findings5.
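To show why an interaction test is preferred to separate p-values within each subgroup, the sketch below compares two independent subgroup estimates with a simple z-test. It covers only the two-subgroup case, assumes approximately normal estimates (for example log hazard ratios), and uses hypothetical numbers.

```python
import math

def interaction_test(est_a, se_a, est_b, se_b):
    """z-test for a difference between two independent subgroup estimates.

    Returns (difference, z statistic, two-sided p-value)."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return diff, z, p

# Hypothetical log hazard ratios (and standard errors) in two subgroups.
diff, z, p = interaction_test(-0.40, 0.15, -0.10, 0.20)
print(f"difference = {diff:.2f}, z = {z:.2f}, interaction p = {p:.2f}")
```

In this made-up example the effect is "significant" in the first subgroup (z of about −2.7) and not in the second (z = −0.5), yet the interaction p-value is about 0.23, so the data give little evidence that the effect truly differs between the subgroups.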

Alternative methods which have been explored at the CSM include analysis based on risk scores, which can improve the power to explore subgroup effects and allow examination of the absolute net benefit in view of patients’ baseline risk4,6,7. Bayesian approaches to subgroup analysis are another area of interest, as they allow the clinical plausibility of a subgroup effect to be taken into account in the analysis8.

  1. Assmann SF, Pocock SJ, Enos LE, Kasten LE. Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet. 2000;355(9209):1064-1069.
  2. Wang R, Lagakos SW, et al. Statistics in Medicine — Reporting of Subgroup Analyses in Clinical Trials. N Engl J Med. 2007;357:2189-2194.
  3. Wallach JD, Sullivan PG, et al. Evaluation of evidence of statistical support and corroboration of subgroup claims in randomized clinical trials. JAMA Internal Medicine. 2017 Apr 1;177(4):554-60.
  4. Pocock SJ, McMurray JJ V, Collier TJ. Statistical Controversies in Reporting of Clinical Trials Part 2 of a 4-Part Series on Statistics for Clinical Trials. J Am Coll Cardiol. 2015;66(23):2648-2662.
  5. Sun X, Briel M, Walter SD, Guyatt GH. Is a subgroup effect believable? Updating criteria to evaluate the credibility of subgroup analyses. BMJ. 2010;340(9209):c117.
  6. Pocock SJ, Lubsen J. More on Subgroup Analyses in Clinical Trials. 2008;19(8):2076-2077.
  7. Fox KAA, Poole-Wilson P, Clayton TC, et al. 5-year outcome of an interventional strategy in non-ST-elevation acute coronary syndrome: The British Heart Foundation RITA 3 randomised trial. Lancet. 2005;366(9489):914-920.
  8. White IR, Pocock SJ, Wang D. Eliciting and using expert opinions about influence of patient characteristics on treatment effects: A Bayesian analysis of the CHARM trials. Stat Med. 2005;24(24):3805-3821.
  9. Stone GW, Rizvi A, et al. Everolimus-eluting versus paclitaxel-eluting stents in coronary artery disease. N Engl J Med. 2010 May 6;362(18):1663-74.

Missing data

Missing data are a common issue in clinical trials1 and can result in underpowered studies or biased findings. The issue of missing data is an active area of research in the CSM, with a dedicated theme and website.

Several guidelines have been published on missing data in the context of clinical trials2–4. The report from the National Research Council concludes that methods should be chosen according to the plausibility of their underlying assumptions, and discourages simple fixes such as last observation carried forward. Because assumptions cannot be verified with the data at hand, conducting sensitivity analyses under alternative assumptions is also recommended2–4; however, a review by Bell et al. found that this was rarely done in practice1.

Topics of particular interest in the CSM include the use of multiple imputation5, for example in cluster-randomised trials6–8, and developing practical approaches for sensitivity analysis when data may be missing not at random4,9.
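For readers less familiar with multiple imputation, the sketch below shows the final pooling step only: combining the estimates from m imputed datasets using Rubin's rules. The function name and the example numbers are hypothetical, and the imputation models themselves (the harder part in practice) are not shown.

```python
import statistics

def pool_rubin(estimates, variances):
    """Combine m completed-data estimates and their variances (Rubin's rules).

    Returns (pooled estimate, total variance)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)      # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical treatment effect estimates from five imputed datasets.
pooled, total = pool_rubin([1.0, 1.2, 0.8, 1.1, 0.9], [0.04] * 5)
print(f"pooled estimate = {pooled:.2f}, standard error = {total ** 0.5:.3f}")
```

The between-imputation term is what carries the extra uncertainty due to the missing values. Analysing a single imputed dataset as if it were complete would omit it and understate the standard error.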

  1. Bell ML, Fiero M, Horton NJ, et al. Handling missing data in RCTs; a review of the top medical journals. BMC Med Res Methodol. 2014;14(1):118.
  2. Burzykowski T, Carpenter J, Coens C, et al. Missing data: Discussion points from the PSI missing data expert group. Pharm Stat. 2010;9(4):288-297.
  3. Little RJ, D’Agostino R, Cohen ML, et al. The prevention and treatment of missing data in clinical trials. N Engl J Med. 2012;367(14):1355-1360.
  4. Carpenter JR, Kenward MG. Missing Data in Randomised Controlled Trials — a Practical Guide.; 2007.
  5. Carpenter JR, Kenward MG. Multiple Imputation and Its Application. John Wiley & Sons; 2013.
  6. DiazOrdaz K, Kenward MG, Gomes M, Grieve R. Multiple imputation methods for bivariate outcomes in cluster randomised trials. Stat Med. 2016;(February). doi:10.1002/sim.6935.
  7. Gomes M, Díaz-Ordaz K, Grieve R, Kenward M. Multiple imputation methods for handling missing data in cost-effectiveness analyses that use data from hierarchical studies: an application to cluster randomized trials. Med Decis Mak. 2013;33(8):1051-1063.
  8. Caille A, Leyrat C, Giraudeau B. A comparison of imputation strategies in cluster randomized trials with missing binary outcomes. Stat Methods Med Res. April 2014
  9. Carpenter JR, Roger JH, Kenward MG. Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation. J Biopharm Stat. 2013;23(3):1352-1371.

Non-compliance

In RCTs, intention-to-treat (ITT) analysis, in which patients are analysed according to the intervention they were allocated to regardless of what they actually received, is considered the gold standard1. ITT analysis estimates the effectiveness or, following the terminology of Carpenter et al., the de facto estimand, i.e. “what would be the effect seen in practice?”2. It is, however, also of interest to estimate the effect of the intervention if the patients were to comply with it (the efficacy or de jure estimand). Although per-protocol analysis (including only patients who complied with their allocated treatment) has been proposed for this, it is now well recognised that it does not preserve the randomisation and is therefore at risk of bias. Methods to estimate a Complier Average Causal Effect (CACE), such as instrumental variables3 or propensity scores4, have been proposed to estimate the causal effect of the intervention amongst participants who actually received it, but several questions require further investigation, including how to estimate the CACE when there is more than one active treatment, or when the required assumptions of the causal inference framework do not hold (see the CSM Causal Inference theme).
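In the simplest setting, the instrumental variable estimate of the CACE is the ITT effect on the outcome divided by the ITT effect on treatment receipt. The sketch below illustrates this ratio (often called the Wald estimator) on a toy dataset. It assumes a two-arm trial, that allocation affects the outcome only through the treatment received, and that nobody does the opposite of their allocation. The function name and the data are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def cace_wald(assigned, received, outcome):
    """Ratio (Wald) estimate of the complier average causal effect.

    assigned, received: 0/1 indicators per participant; outcome: numeric.
    Returns (CACE estimate, ITT estimate)."""
    y1 = [y for z, y in zip(assigned, outcome) if z == 1]
    y0 = [y for z, y in zip(assigned, outcome) if z == 0]
    d1 = [d for z, d in zip(assigned, received) if z == 1]
    d0 = [d for z, d in zip(assigned, received) if z == 0]
    itt_outcome = mean(y1) - mean(y0)  # effect of allocation on the outcome
    itt_receipt = mean(d1) - mean(d0)  # effect of allocation on receipt
    if itt_receipt == 0:
        raise ValueError("allocation did not change treatment receipt")
    return itt_outcome / itt_receipt, itt_outcome

# Toy data: 10 allocated to intervention (6 received it), 10 to control.
assigned = [1] * 10 + [0] * 10
received = [1] * 6 + [0] * 4 + [0] * 10
outcome = [3.0] * 6 + [1.0] * 4 + [1.0] * 10
cace, itt = cace_wald(assigned, received, outcome)
print(f"ITT = {itt:.2f}, CACE = {cace:.2f}")
```

Here the ITT effect (1.2) is diluted by the 40% of the intervention arm who did not receive the treatment, and the ratio rescales it to the effect among compliers (2.0). Unlike a per-protocol comparison, this uses the randomised groups as allocated.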

Ongoing research at the School looks at the use of multiple imputation of the compliance strata to estimate the CACE, and at how to tackle non-compliance in specific study designs such as cluster randomised trials, in which the compliance decision can occur at both the individual and the cluster level, or non-inferiority trials, in which an underestimation of the treatment effect with ITT analysis can have important consequences in practice.

  1. Newell DJ. Intention-to-treat analysis: implications for quantitative and qualitative research. Int J Epidemiol. 1992 Oct;21(5):837–41.
  2. Carpenter JR, Roger JH, Kenward MG. Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation. J Biopharm Stat. 2013;23(6):1352–71.
  3. Angrist JD, Imbens GW, Rubin DB. Identification of Causal Effects Using Instrumental Variables. J Am Stat Assoc. 1996;91(434):444–55.
  4. Porcher R, Leyrat C, Baron G, Giraudeau B, Boutron I. Performance of principal scores to estimate the marginal compliers causal effect of an intervention. Stat Med. 2015 Sep 17.

Sequential trials

Sequential designs were originally based on industrial quality control, to monitor whether a process resulted in an abnormally high proportion of defects1. They focus on decision making rather than estimation. The simplest sequential trial designs involve assessing each person as their outcome becomes available, although they can be modified to analyse at regular intervals2–4. These modified designs can be used in a similar way to those based on the alpha-spending group sequential approach5, although the two approaches are based on different philosophies.
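To make the idea of assessing each outcome as it arrives concrete, the sketch below implements Wald's sequential probability ratio test (SPRT) for a single proportion. This is the classical fully sequential procedure, not the triangular test, and the cure rates and error probabilities used are hypothetical.

```python
import math

def sprt_bernoulli(outcomes, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT of H0: p = p0 against H1: p = p1 (with p1 > p0).

    outcomes: iterable of 0/1 results in the order they become available.
    Returns (decision, number of observations used)."""
    upper = math.log((1 - beta) / alpha)  # crossing this favours H1
    lower = math.log(beta / (1 - alpha))  # crossing this favours H0
    llr = 0.0  # running log-likelihood ratio
    n = 0
    for n, y in enumerate(outcomes, start=1):
        llr += math.log(p1 / p0) if y else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue", n

# Hypothetical: unacceptable cure rate 75%, target cure rate 90%.
print(sprt_bernoulli([1] * 20, p0=0.75, p1=0.90))  # all patients cured
print(sprt_bernoulli([0] * 20, p0=0.75, p1=0.90))  # no patients cured
```

With these assumed values, an unbroken run of cures crosses the upper boundary at the 17th patient, while a run of failures crosses the lower boundary at the 4th. This shows how a sequential design can stop early in either direction.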

One of the simpler sequential designs is the triangular test for a single proportion.  This is illustrated in the following figure which is from a non-comparative trial of three treatments for visceral leishmaniasis in East Africa6.

The outcome of interest is the number of patients who recovered, which is captured on the vertical axis. The horizontal axis (V) is proportional to the number of people recruited to the trial so far. The trial continues while the findings fall in the triangular region, which is calculated based on the acceptable cure rate and error probabilities. Findings above the upper boundary (as is the case here for the three treatments at the third time-point) indicate a favourable conclusion, while findings below the triangle indicate an unfavourable conclusion.

Aspects of sequential designs being researched by members of the CSM and their collaborators include analysis methods for time points subsequent to assessment of the primary endpoint, and development of asymmetrical stopping boundaries e.g. the imposition of a minimum sample size7,8.

  1. Wald, A. Sequential Analysis. (Wiley, 1947).
  2. Whitehead, J. The Design and Analysis of Sequential Clinical Trials. 1st edn, (Ellis Horwood, 1983).
  3. Bellissant, E., Benichou, J. & Chastang, C. Application of the triangular test to phase II cancer clinical trials. Stat Med 9, 907-917 (1990).
  4. Ranque, S., Badiaga, S., Delmont, J. & Brouqui, P. Triangular test applied to the clinical trial of azithromycin against relapses in Plasmodium vivax infections. Malar J 1, 13 (2002).
  5. Lan, K. K. G. & DeMets, D. L. Discrete sequential boundaries for clinical trials. Biometrika 70, 659-663 (1983).
  6. Wasunna, M. et al. Efficacy and Safety of AmBisome in Combination with Sodium Stibogluconate or Miltefosine and Miltefosine Monotherapy for African Visceral Leishmaniasis: Phase II Randomized Trial. PLoS Negl Trop Dis 10, e0004880.
  7. Omollo, R. et al. Safety and efficacy of miltefosine alone and in combination with sodium stibogluconate and liposomal amphotericin B for the treatment of primary visceral leishmaniasis in East Africa: study protocol for a randomized controlled trial. Trials 12, 166 (2011).
  8. Allison, A. et al. Generalizing boundaries for triangular designs, and efficacy estimation at extended follow-ups. Trials 16, 522, doi:10.1186/s13063-015-1018-1 (2015).

Good Practice in Trials

Clinical trials need to comply with different regulatory and legal requirements depending on the jurisdiction(s) to which they are subject. In particular, they usually need to comply with Good Clinical Practice (GCP) as defined by the International Conference on Harmonization.

In addition, some reporting guidelines have become de facto standards due to their adoption by medical journals. For clinical trials the most important is CONSORT1, or CONsolidated Standards of Reporting Trials, whose use is encouraged by the International Committee of Medical Journal Editors.

As well as contributing to the development of CONSORT and other reporting guidelines, LSHTM researchers have written prominent journal articles on the principles and practice of clinical trials.  These include guidance on trial design2,3, writing and interpreting clinical trial reports4, interpreting results from trials which did, or did not, find an effect of the intervention on the primary endpoint5,6, and current statistical controversies in reporting clinical trials7.

  1. Schulz, K. F., Altman, D. G. & Moher, D. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. BMC Med 8, 18, doi:1741-7015-8-18 (2010).
  2. Pocock, S. J., Clayton, T. C. & Stone, G. W. Challenging Issues in Clinical Trial Design: Part 4 of a 4-Part Series on Statistics for Clinical Trials. J Am Coll Cardiol 66, 2886-2898 (2015).
  3. Pocock, S. J., Clayton, T. C. & Stone, G. W. Design of Major Randomized Trials: Part 3 of a 4-Part Series on Statistics for Clinical Trials. J Am Coll Cardiol 66, 2757-2766 (2015).
  4. Pocock, S. J., McMurray, J. J. & Collier, T. J. Making Sense of Statistics in Clinical Trial Reports: Part 1 of a 4-Part Series on Statistics for Clinical Trials. J Am Coll Cardiol 66, 2536-2549 (2015).
  5. Pocock, S. J. & Stone, G. W. The Primary Outcome Is Positive – Is That Good Enough? N Engl J Med 375, 971-979 (2016).
  6. Pocock, S. J. & Stone, G. W. The Primary Outcome Fails – What Next? N Engl J Med 375, 861-870, doi:10.1056/NEJMra1510064 (2016).
  7. Pocock, S. J., McMurray, J. J. & Collier, T. J. Statistical Controversies in Reporting of Clinical Trials: Part 2 of a 4-Part Series on Statistics for Clinical Trials. J Am Coll Cardiol 66, 2648-2662 (2015).

Bayesian Statistics


Big Data and Machine Learning

Theme Co-ordinators: Elizabeth Williamson, Nuno Sepulveda, Jan van der Meulen, Luigi Palla

Please see here for slides and audio recordings of previous seminars relating to this theme.


  • Big data – a quick overview
  • Some key methodological issues
  • Some areas of application
  • Events

In recent years, big data has become the new hype in biomedical research due to great achievements in technological development. Benchmark examples of big data are (i) online tracking of flu epidemics, (ii) the genetic and genomic analyses of many human diseases, and (iii) the analysis of millions of health and hospital records. However, the explosion of big data applications has brought with it interesting methodological questions, such as how to best store, manage, analyse and integrate such ever-increasing data.

This theme aims to provide a sharing space for methodological development on big data problems and its dissemination across the LSHTM research community.

Some of the key methodological issues that members of our theme are working on are:

  • Methods for assessing and improving data quality
  • Missing and poorly measured data
  • Data linkage
  • Data mining, multivariate statistics
  • Causal inference for big data
  • Phenotyping
  • Stochastic models for high throughput technologies
  • Geo-mapping
  • Prediction
  • Machine learning

Some areas of application using big data within the school are:

  • Environmental epidemiology
  • Health service evaluation
  • Health economics studies
  • Pharmacoepidemiology
  • Nutritional epidemiology
  • Genomic epidemiology
  • ‘Omics integration and systems biology
  • Sero-epidemiology of infectious disease
  • Analysis of microbiome

Statistical/methodological issues

Methods for assessing and improving data quality

The 3 V’s for big data – volume, variety, and velocity – were quickly succeeded by the 4 V’s, adding in “veracity”. It was quickly recognised that the most sophisticated big data analytics could not overcome limitations of poorly captured data. Increasing electronic capture and storage of information does not, unfortunately, guarantee good data quality.

There is a relative paucity of methodological work to assess and improve the quality of data in big data settings. However, better detection of errors, leading to enhanced chances of correcting erroneous data, is essential for the validity of subsequent analysis.

Various approaches for detecting likely errors in data have been proposed. In the context of longitudinal data within routinely collected primary care data, one promising method developed in collaboration with members of our theme uses an iterative approach of fitting mixed models, identifying likely outliers, and re-fitting the model after removal of outliers (Welch et al, 2012).

Some relevant references:

Welch C, Petersen I, Walters K, Morris RW, Nazareth I, Kalaitzaki E, White IR, Marston L, Carpenter J. Two-stage method to remove population- and individual-level outliers from longitudinal data in a primary care database. Pharmacoepidemiology and Drug Safety, 2012; 21: 725-732.

Missing data

The challenge of missing data is not restricted to the context of large datasets of routinely or semi-automatically collected data. However, missing data in such settings raise complex and often novel challenges; we highlight two below.

The first is referred to as ‘data dependent sampling’: in other words, the process on which you are trying to collect data controls, to some extent, the data you are able to collect. To give two examples:

  • using wearable devices to measure activity can over-estimate usual activity due to participants choosing to leave their device at home on low activity days; this is a form of measurement error
  • in routinely collected primary care data, clinical and therapeutic information is collected only when the patient chooses to visit their general practitioner – and then only for reasons specifically relevant to the consultation.
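The first example can be mimicked with a short simulation. The sketch below is purely illustrative: the step counts, and the assumed relationship between a day's activity and the chance that the device is worn, are invented to show the direction of the bias rather than its likely size.

```python
import math
import random

rng = random.Random(11)
n_days = 20000

# Hypothetical daily step counts for one participant.
steps = [max(0.0, rng.gauss(8000, 2500)) for _ in range(n_days)]

def wear_probability(day_steps):
    """Assumed chance the device is worn: rises with the day's activity."""
    return 1 / (1 + math.exp(-(day_steps - 8000) / 2500))

# Only days on which the device was worn contribute a recorded value.
recorded = [s for s in steps if rng.random() < wear_probability(s)]

true_mean = sum(steps) / len(steps)
recorded_mean = sum(recorded) / len(recorded)
print(f"true mean steps:     {true_mean:.0f}")
print(f"recorded mean steps: {recorded_mean:.0f}")
```

Because low-activity days are less likely to be recorded, the mean of the recorded days overstates usual activity. Collecting more days of data does not help, since the bias comes from which days are observed, not from how many.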

The second challenge arises because of the sheer volume of the data. While imputation and related approaches are flexible and powerful and offer much potential, they must be adapted to meet these challenges, which can violate their underpinning assumptions.

Some recent work within the group that has addressed some of these challenges:

  • two-fold imputation, an adaptation of multiple imputation which attempts to simplify the problem by conditioning only on measurements which are local in time (Welch et al, 2014); and
  • a paper correcting misconceptions on the use of multiple imputation to handle missing data in propensity score analyses (Leyrat et al, 2017)

Some relevant references:

Welch C, Petersen I, Bartlett J, White IR, Marston L, Morris RW, Nazareth I, Walters K, Carpenter J. Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data. Statist. Med. 2014, 33:3725–3737.

Leyrat C, Seaman SR, White IR, Douglas I, Smeeth L, Kim J, Resche-Rigon M, Carpenter JR, Williamson EJ. Propensity score analysis with partially observed covariates: How should multiple imputation be used? Stat Methods Med Research, 2017, doi: 10.1177/0962280217713032. [Epub ahead of print]

Causal inference

Assessing causal relationships from non-randomised data poses many methodological challenges, particularly related to confounding and selection bias. These are exacerbated in studies conducted using routinely collected data: data not collected for the primary purpose of research tend to be less regular and less complete than traditional data sources used to address such questions.

In comparative effectiveness studies of medications, there is often a wealth of information available regarding previous diagnoses, medications, referrals and therapies. However, how best to incorporate this information into analyses remains unclear. The high-dimensional propensity score (Schneeweiss et al, 2009) is an empirical algorithm to select potential confounders, prioritise candidates, and incorporate selected variables into a propensity-score based statistical model. This algorithm was developed in the context of US claims data; the validity of its application to different settings, such as routinely collected primary care data in the UK, remains unclear.

An alternative approach to the incorporation of a large number of potential confounders into a causal model is offered by Targeted Maximum Likelihood Estimation (TMLE). This approach has been applied to UK primary care data to investigate the association between statins and all-cause mortality (Pang et al, 2016), with the authors concluding that a deeper understanding of the comparative advantages and disadvantages of this approach was needed within this big-data setting.

To begin to address this knowledge gap, members of our theme have developed a free open source online tutorial to introduce TMLE for Causal Inference, which can be found here.

They have also created and made available a free open source Stata program to implement double-robust methods for causal inference, including Machine Learning algorithms for prediction (see links below).

A promising approach to poorly measured, or unmeasured, confounding is offered by self-controlled designs, such as the self-controlled risk interval, case-crossover and self-controlled case series designs. The self-controlled case series, for example, uses individuals as their own controls, thus removing time-invariant confounders.

Some relevant references:

Franklin JM, Schneeweiss S, Solomon DH. Assessment of Confounders in Comparative Effectiveness Studies From Secondary Databases. Am J Epidemiol. 2017; 185(6): 474-478. doi: 10.1093/aje/kww136.

Franklin JM, Eddings W, Austin PC, Stuart EA, Schneeweiss S. Comparing the performance of propensity score methods in healthcare database studies with rare outcomes. Stat Med. 2017; 36(12): 1946-1963. doi: 10.1002/sim.7250.

Schneeweiss S, Rassen JA, Glynn RJ, Avorn J, Mogun H, Brookhart MA. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology. 2009; 20(4): 512-522.

Pang M, Schuster T, Filion KB, Eberg M, Platt RW. Targeted maximum likelihood estimation for pharmacoepidemiologic research. Epidemiology. 2016; 27(4): 570-577.

Kang JD, Schafer JL. Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science. 2007: 523-39.

Schuler MS, Rose S. Targeted Maximum Likelihood Estimation for Causal Inference in Observational Studies. American Journal of Epidemiology. 2016. doi: 10.1093/aje/kww165.

S Gruber and MJ van der Laan. tmle: An R Package for Targeted Maximum Likelihood Estimation. Journal of Statistical Software. 2012; 51(13).

Gruber. Targeted Learning in Healthcare Research. Big Data. 2016; 3(4), 211-218. DOI:10.1089/big.2015.0025.

Software (open source):

Author: Dr. Miguel Angel Luque-Fernandez, LSHTM.

Online tutorial:

Author: Dr. Miguel Angel Luque-Fernandez, LSHTM.


Data linkage

Linkage has been described as “a merging that brings together information from two or more sources of data with the object of consolidating facts concerning an individual or an event that are not available in any separate record” (Organisation for Economic Co-operation and Development (OECD) Glossary of Statistical Terms).

Linking health related datasets offers the opportunity to improve data quality, by improving ascertainment of key risk-factors and outcomes, allowing inconsistencies to be identified and resolved.  It is a cost-effective means of assembling a dataset, exploiting existing resources.  However, challenges associated with data linkage include the lack of unique identifiers for linkage, leading to possible errors in the linkage, and data security considerations.

Small amounts of linkage error can result in substantially biased results. False matches introduce variability and weaken the association between variables, often resulting in bias to the null, and missed matches reduce the sample size and result in a loss of statistical power and potential selection bias. Evaluating the potential impact of linkage error on results is vital (Harron et al, 2014).
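The bias towards the null caused by false matches can be illustrated with a small simulation. In the sketch below, a fraction of records is linked to the outcome of a different, randomly chosen record, and the regression slope is re-estimated. The sample size, true slope and false match rate are hypothetical, and false matches are assumed to occur completely at random, which need not hold in practice.

```python
import random

def slope(x, y):
    """Least-squares slope from a simple linear regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(7)
n, beta, false_match_rate = 20000, 1.0, 0.2

x = [rng.gauss(0, 1) for _ in range(n)]        # exposure in dataset A
y = [beta * xi + rng.gauss(0, 1) for xi in x]  # outcome in dataset B

# False matches: some records pick up the outcome of a randomly chosen record.
y_linked = list(y)
for i in range(n):
    if rng.random() < false_match_rate:
        y_linked[i] = y[rng.randrange(n)]

slope_true = slope(x, y)
slope_linked = slope(x, y_linked)
print(f"slope with perfect linkage: {slope_true:.2f}")
print(f"slope with 20% false matches: {slope_linked:.2f}")
```

Under these assumptions the estimated slope shrinks roughly in proportion to the share of correct links, from about 1.0 to about 0.8. Linkage error that depends on patient characteristics can bias results in either direction, which is one reason evaluating its impact matters.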

Some relevant references:

Harron K, Wade A, Gilbert R, Muller-Pebody B, Goldstein H. Evaluating bias due to data linkage error in electronic healthcare records. BMC Medical Research Methodology, 2014, 14: 36.  DOI: 10.1186/1471-2288-14-36.


Environmental epidemiology

In recent decades, the research community has made important steps forward in understanding the relationship between exposure to environmental factors and human health. Big data technologies offer the opportunity to extend this research further, for instance by making available high-resolution exposure data from remote sensing tools and real-time measurement from smartphone mobile applications, and by linking electronic health records including large collections of variables on health data and personal characteristics. However, this new setting requires the development of novel analytical methods for handling complex data structures and for modelling individual risk profiles with longitudinal measures on time-varying exposures, health outcomes and susceptibility factors. This new ‘big data’ framework can improve the analytical capability of environmental health studies and extend our knowledge on the complex pathways linking exposures to environmental stressors and human health.

Some relevant references:

Gasparrini, A et al. Mortality risk attributable to high and low ambient temperature: a multicountry observational study.  The Lancet, Volume 386, Issue 9991, 369 – 375.

Di Q, Wang Y et al. Air pollution and mortality in the Medicare population. New Eng J Med. 2017, Volume 376, No 26, 2513-22.


Pharmacoepidemiology

Large-scale routinely collected health data provide unprecedented potential for population-based health research. The advantages of using these data include the low cost and timeliness of the research, greatly increased population coverage, and increased statistical power. The US Food and Drug Administration and European Medicines Agency now mandate the use of “real world evidence” of medication effects in drug licensing; in practice such real world evidence often comes from studies incorporating routinely collected health data.

Use of routinely collected data to establish causal relationships, however, raises a number of challenges. Information bias, due to low quality or missing information, remains an issue despite financial incentive schemes aimed at improving data quality such as the Quality and Outcomes Framework in the UK. Linkage between data sources improves capture of key outcomes and exposures, but brings additional potential sources of bias. Confounding by poorly measured or unmeasured factors complicates the comparison between groups prescribed different medications. Related to this, the very reasons drugs are prescribed are often highly correlated with the outcomes we wish to study, and this “confounding by indication” remains a key challenge in pharmacoepidemiology.

Despite these challenges, there have been notable successes. For example, members of our theme used linked primary and secondary care data to replicate well known results from randomised trials regarding the effect of statins on vascular outcomes. The same study was used to demonstrate no association between statins and cancer; the absence of this link was later confirmed by randomised trials. Self-controlled designs have great potential to remove time-invariant confounding, and have been successfully implemented by our group to investigate a wide range of associations between vaccines/drugs and adverse outcomes.

Some relevant references:

Smeeth L, Douglas I, Hall AJ, Hubbard R, Evans S. Effect of statins on a wide range of health outcomes: a cohort study validated by comparison with randomised trials. Br J Clin Pharmacol. 2009; 67(1): 99-109.

Nutritional epidemiology

The effects of dietary intake (containing many different foods and nutrients) on health are complex. Understanding specific effects requires taking into account the interactions among dietary exposures which, as is now recognised, should be analysed jointly to extract dietary patterns that better summarise the effect of food intake on health. To this end, multivariate statistical methods such as principal component, cluster and factor analysis are needed.
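As a rough illustration of how principal component analysis can summarise correlated dietary exposures into patterns, the following is a minimal sketch on simulated data. The food names, the two latent patterns (`prudent`, `western`) and their loadings are hypothetical assumptions made for the example, not results from any study.

```python
import numpy as np

rng = np.random.default_rng(3)
n_people = 500
foods = ["fruit", "vegetables", "wholegrain", "processed_meat", "sugary_drinks"]

# Hypothetical intakes driven by two latent dietary patterns (illustrative assumption).
prudent = rng.normal(size=n_people)
western = rng.normal(size=n_people)
loadings = np.array([[1.0, 1.0, 0.8, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, 0.9]])
noise = 0.5 * rng.normal(size=(n_people, len(foods)))
intake = np.column_stack([prudent, western]) @ loadings + noise

# Standardise each food, then obtain principal components via the SVD.
z = (intake - intake.mean(axis=0)) / intake.std(axis=0)
u, s, vt = np.linalg.svd(z, full_matrices=False)

explained = s**2 / np.sum(s**2)   # share of total variance per component
scores = z @ vt.T                 # each person's score on each component

for k in range(2):
    pattern = dict(zip(foods, np.round(vt[k], 2)))
    print(f"component {k + 1}: {explained[k]:.0%} of variance, loadings {pattern}")
```

In this simulated setting the first two components should capture most of the variance, with one loading mainly on the first three foods and the other on the last two. Real dietary data are rarely this clean, and the choice of the number of components is a separate modelling decision.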

As food intake can be difficult to measure as an epidemiological exposure, attempts to measure dietary behaviour via (validated) metabolic biomarkers are underway. This has linked the field of nutritional epidemiology to the Omics field and to the various methodological issues characterising chemometric data obtained via Nuclear Magnetic Resonance (NMR) and high-throughput Mass Spectrometry (MS).

Additional Big Data complexities arise in nutritional surveys conducted through dietary diaries, as these record the eating occasions at different times for each individual involved. This results in a very large number of observations that, on the one hand, can be used for data mining and hypothesis generation (e.g. on the context and timing of eating) through multivariate methods and, on the other, call for methodological developments to accommodate their complex hierarchical structure.

Some relevant references:

Gleason PM, Boushey CJ, Harris JE et al. Publishing nutrition research: A review of Multivariate Techniques - Part 3: Data Reduction Methods. J Acad Nutr Diet. 2015;115:1072-1082.

Assi N, Moskat A, Slimani N et al. A treelet transform analysis to relate nutrient patterns to the risk of hormonal receptor-defined breast cancer in the European Prospective Investigation into Cancer and Nutrition (EPIC). Public Health Nutr, 2016 Feb; 19(2): 242-5.

Chapman A, Beh E, Palla L. Application of Correspondence Analysis to Graphically Investigate Associations Between Foods and Eating Locations. Studies in Health Technology and Informatics, 2017; 235: 166-170.


On 14 March 2017 we held an introductory workshop to discuss shared methodological interests within Big Data across the school. Please see here for slides and audio recordings.

On 7 July 2017 we held a half-day symposium on “Statistical Methods for Big Data”. Please see here for the full details and here for slides and audio recordings.

Through the year, we will organise a series of workshops and seminars aimed at bringing together researchers encountering methodological challenges in analysing big data, and methodologists with interests in relevant areas.

Causal Inference

Theme Co-ordinators: Simon Cousens, Karla Diaz-Ordaz, Richard Silverwood, Ruth Keogh, Stijn Vansteelandt.

Please see here for slides and audio recordings of previous seminars relating to this theme.

This field was recently criticised in a paper by Vandenbroucke et al. Members of this theme have written a response to these criticisms, which is available here.


As part of the CSM’s activities, seminars on causal inference are often organised. Past speakers include Philip Dawid, Vanessa Didelez, Richard Emsley, Miguel Hernán, Erica Moodie and Anders Skrondal.

Details of upcoming seminars can be found here.

UK Causal Inference Meeting 2016

We were delighted to host – together with the London School of Economics and Political Science – the 4th annual UK Causal Inference Meeting, which took place at LSHTM from 13-15 April 2016. More details can be found here.

Other events

As well as regular research seminars, we occasionally organise one-off events on topics of particular interest. For example, we are currently planning a half-day meeting on recent controversies in propensity scores. Details of upcoming events can be found here.

A brief overview of a vast and rapidly-expanding subject

Causal inference is a central aim of many empirical investigations, and arguably most studies in the fields of medicine, epidemiology and public health. We would like to know ‘does this treatment work?’, ‘how harmful is this exposure?’, or ‘what would be the impact of this policy change?’.

The gold standard approach to answering such questions is to conduct a controlled experiment in which treatments/exposures are allocated at random, all participants adhere perfectly to the treatment assigned, and all the relevant data are collected and measured without error. Provided that we can then discount ‘chance’ alone as an explanation, any observed differences between treatment groups can be given a causal interpretation (albeit in a population that may differ from the one in which we are interested).

In the real world, however, such experiments rarely attain this ideal status, and for many important questions, such an experiment would not even be ethically, practically, or economically feasible, and our investigations must be based instead on observational data. In reality, therefore, causal inference is a very ambitious goal. However, since it undeniably is the only useful goal in so many contexts, we must try our best. This involves carefully formulating the causal question to be tackled, explicitly stating the assumptions under which the answers may be trusted, often considering novel analysis methods that may require weaker assumptions than would be required by traditional approaches, and finally using sensitivity analyses to explore how robust our conclusions are to violations of the assumptions.

Historically, even when attempting causal inference, the role of statistics was seen to be to quantify the extent to which ‘chance’ could explain the results, with concerns over systematic biases due to the non-ideal nature of the data relegated to the qualitative discussion of the results. The field known as causal inference has changed this state of affairs, setting causal questions within a coherent framework which facilitates explicit statement of all the assumptions underlying the analysis and allows extensive exploration of potential biases. In the paragraphs that follow, we will attempt a brief overview.

 A language for causal inference (potential outcomes and counterfactuals)

Over the last thirty years, a formal statistical language has been developed in which causal effects can be unambiguously defined, and the assumptions needed for their identification clearly stated. Although alternative frameworks have been suggested (see, for example, Dawid, 2000) and developed, the language which has gained most traction in the health sciences is that of potential outcomes, also called counterfactuals (Rubin, 1978).

Suppose X is a binary exposure, Y a binary outcome, and C a collection of potential confounders, measured before X. We write Y0 and Y1 for the two potential outcomes; the first is the outcome that would be seen if X were set (possibly counter to fact) to 0, and the second is what would be seen if X were set to 1. Causal effects can then be expressed as contrasts of aspects of the distribution of these potential outcomes. For example:

E(Y1) – E(Y0)

E(Y1|X=1) / E(Y0|X=1)

log[E(Y1|C)/{1–E(Y1|C)}] – log[E(Y0|C)/{1–E(Y0|C)}]
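To make these contrasts concrete, here is a minimal simulation sketch in which both potential outcomes are generated for every individual, something that is never possible with real data. The data-generating probabilities are hypothetical assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (an assumption for illustration only).
c = rng.binomial(1, 0.5, size=n)          # binary confounder C
x = rng.binomial(1, 0.3 + 0.4 * c)        # exposure X depends on C
y0 = rng.binomial(1, 0.10 + 0.20 * c)     # potential outcome if X were set to 0
y1 = rng.binomial(1, 0.20 + 0.30 * c)     # potential outcome if X were set to 1

def logit(p):
    return np.log(p / (1 - p))

# E(Y1) - E(Y0): marginal causal risk difference
risk_difference = y1.mean() - y0.mean()

# E(Y1|X=1) / E(Y0|X=1): causal risk ratio among the exposed
risk_ratio_exposed = y1[x == 1].mean() / y0[x == 1].mean()

# Conditional causal log odds ratio, evaluated within each level of C
log_or_by_c = {k: logit(y1[c == k].mean()) - logit(y0[c == k].mean()) for k in (0, 1)}

print(risk_difference, risk_ratio_exposed, log_or_by_c)
```

Under these assumed probabilities the three quantities are approximately 0.15, 1.71, and 0.81 (C=0) or 0.85 (C=1). With observed data only one of Y0 and Y1 is seen per person, which is why identification assumptions are needed.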


Causal inference methods for economic evaluations of longitudinal interventions

Noemi Kreif (LSHTM)

Noemi holds a Medical Research Council Early Career Fellowship in the Economics of Health, on improving statistical methods to address confounding in the economic evaluation of health interventions, including a collaboration with Dr Maya Petersen at UC Berkeley Division of Biostatistics. She is investigating advanced causal inference methods for the setting of economic evaluations of longitudinal interventions. In particular, she is using targeted maximum likelihood estimation and machine learning to compare dynamic treatment regimes, using a non-randomised study on nutritional intake of critically ill children.

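Conditional exchangeability, the assumption that the potential outcomes are independent of the exposure X given the confounders C, formalises the notion of “no unmeasured confounders”.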

The increased clarity afforded by this language has led to increased awareness of causal pitfalls (such as the ‘birthweight paradox’ – see Hernández-Díaz et al, 2006) and the building of a new and extensive toolbox of statistical methods especially designed for making causal inferences from non-ideal data under transparent, less restrictive and more plausible assumptions than were hitherto required.

Of course this does not mean that all causal questions can be answered, but at least they can be formally formulated and the plausibility of the required assumptions assessed.

Considerations of causality are not new. Neyman used potential outcomes in his PhD thesis in the 1920s, and who could forget Bradford Hill’s much-cited guidelines published in 1965? The last few decades, however, have seen the focus move towards developing solutions, as well as acknowledging limitations.

Traditional methods

Not all reliable causal inference requires novel methodology. A carefully-considered regression model, with an appropriate set of potential confounders (possibly identified using a causal diagram – see below) measured and appropriately included as covariates, is a reasonable approach in some settings.

Causal diagrams

A ubiquitous feature of methods for estimating causal effects from non-ideal data is the need for untestable assumptions regarding the causal structure of the variables being analysed (from which conditions such as conditional exchangeability can be deduced). Such assumptions are often represented in a causal diagram or graph, with variables identified by nodes and the relationships between them by edges. The simplest and most commonly-used class of causal diagram is the (causal) directed acyclic graph (DAG), in which all edges are arrows, and there are no cycles, i.e. no variable explains itself (Greenland et al, 1999). These are used not only to represent assumptions but also to inform the choice of a causally-interpretable analysis, specifically to help decide which variables should be included as confounders.
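A DAG can be encoded very simply in software. The sketch below stores a hypothetical four-node graph (C, X, M, Y) as a mapping from each node to its parents, checks that it is acyclic, and lists the common ancestors of X and Y. Listing common ancestors is only a crude first step towards choosing confounders and is not a substitute for a full graphical analysis such as the back-door criterion.

```python
from collections import deque

# Hypothetical DAG: each key is a node, each value lists its direct causes (parents).
parents = {
    "C": [],
    "X": ["C"],
    "M": ["X"],
    "Y": ["C", "M"],
}

def topological_order(parents):
    """Return a topological order of the nodes; raise ValueError if a cycle exists."""
    remaining = {node: set(ps) for node, ps in parents.items()}
    ready = deque(sorted(n for n, ps in remaining.items() if not ps))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for other in sorted(remaining):
            ps = remaining[other]
            if node in ps:
                ps.remove(node)
                if not ps:
                    ready.append(other)
    if len(order) != len(parents):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order

def ancestors(parents, node):
    """Return every node with a directed path into `node`."""
    seen = set()
    stack = list(parents[node])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents[p])
    return seen

order = topological_order(parents)
common_causes = ancestors(parents, "X") & ancestors(parents, "Y")
print(order, common_causes)
```

In this example graph, C is the only common ancestor of X and Y, while M lies on the causal path from X to Y and so would not be adjusted for when estimating the total effect of X.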

Fully-parametric approaches to problems involving many variables

Another common feature of causal inference methods is that, as we move further from the ideal experimental setting, more aspects of the joint distribution of the variables must be modelled, which would have been ancillary had the data arisen from a perfect experiment. Structural equation modelling (SEM) (Kline, 2011) is a fully-parametric approach, in which the relationship between each node in the graph and its parents is specified parametrically. This approach offers an elegant (full likelihood) treatment of ignorable missing data and measurement error, when this affects any variable for which validation or replication data are available.

Semiparametric approaches

Concerns over the potential impact of model misspecification in fully-parametric approaches have led to the development of alternative semiparametric approaches to causal inference, in which the number of additional aspects to be modelled is reduced. These include methods based on the propensity score (Rosenbaum and Rubin, 1983), including inverse probability weighting, and g-estimation, and the so-called doubly-robust estimation proposed by Robins, Rotnitzky and others.
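As an illustration of inverse probability weighting, the following sketch simulates a single binary confounder, estimates the propensity score within its strata, and compares the weighted estimate of E(Y1) – E(Y0) with the crude comparison. The simulated probabilities are hypothetical, and the stratum-based propensity estimate is a simplification that stands in for the regression models used in practice.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical data: C confounds the X-Y relationship; true risk difference is 0.1.
c = rng.binomial(1, 0.5, size=n)
x = rng.binomial(1, 0.2 + 0.6 * c)
y = rng.binomial(1, 0.1 + 0.1 * x + 0.3 * c)

# Propensity score Pr(X=1|C), estimated as the exposed proportion in each stratum of C.
ps = np.where(c == 1, x[c == 1].mean(), x[c == 0].mean())

# Crude comparison ignores confounding by C.
crude = y[x == 1].mean() - y[x == 0].mean()

# Inverse probability weights: 1/ps for the exposed, 1/(1-ps) for the unexposed.
w = x / ps + (1 - x) / (1 - ps)
ipw = np.average(y, weights=w * x) - np.average(y, weights=w * (1 - x))

print(f"crude: {crude:.3f}, IPW: {ipw:.3f}")
```

The weighted means are normalised by the sum of the weights in each exposure group, which is one common variant among several. In this simulation the crude difference is far from 0.1, whereas the weighted estimate should be close to it.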

Inferring the effects of time-varying exposures

Novel causal inference methods are particularly relevant for studying the causal effect of a time-varying exposure on an outcome, because standard methods fail to give causally-interpretable estimators when there exist time-varying confounders of the exposure and outcome that are themselves affected by previous levels of the exposure. Methods developed to deal with this problem include the fully-parametric g-computation formula (Robins, 1986), and two semiparametric approaches: g-estimation of structural nested models (Robins et al, 1992), and inverse probability weighted estimation of marginal structural models (Robins et al, 2000). For an accessible tutorial on these methods, see Daniel et al (2013). Related to this longitudinal setting is the identification of optimal treatment regimes, for example in HIV/AIDS research where questions such as ‘at what level of CD4 should HAART (highly active antiretroviral therapy) be initiated?’ are often asked. These can be addressed using the methods listed above, and other related methods (see Chakraborty and Moodie, 2013).
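The g-computation formula can be sketched for a hypothetical two-visit study. For simplicity the example below uses a nonparametric (saturated) version with binary variables, not the fully-parametric form described above, and all simulated values are assumptions. Here L1 is affected by the earlier exposure X0 and shares an unmeasured cause U with the outcome.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# Hypothetical two-visit study (all values are illustrative assumptions).
u = rng.binomial(1, 0.5, size=n)                 # unmeasured cause of L1 and Y
x0 = rng.binomial(1, 0.5, size=n)                # exposure at visit 0 (randomised)
l1 = rng.binomial(1, 0.2 + 0.3 * x0 + 0.4 * u)   # time-varying confounder, affected by x0
x1 = rng.binomial(1, 0.2 + 0.6 * l1)             # exposure at visit 1 depends on l1
y = x0 + x1 + 2.0 * u + rng.normal(size=n)       # true effect of (1,1) versus (0,0) is 2

def g_formula_mean(a0, a1):
    """Nonparametric g-formula estimate of E[Y] had everyone received (a0, a1)."""
    total = 0.0
    for level in (0, 1):
        # Pr(L1 = level | X0 = a0)
        p_level = np.mean(l1[x0 == a0] == level)
        # E[Y | X0 = a0, L1 = level, X1 = a1]
        mean_y = y[(x0 == a0) & (l1 == level) & (x1 == a1)].mean()
        total += p_level * mean_y
    return total

g_effect = g_formula_mean(1, 1) - g_formula_mean(0, 0)
crude_effect = y[(x0 == 1) & (x1 == 1)].mean() - y[(x0 == 0) & (x1 == 0)].mean()

print(f"g-formula: {g_effect:.2f}, crude: {crude_effect:.2f}")
```

In this set-up the crude always-versus-never comparison is biased upwards, while the g-formula estimate should be close to 2. Simply adding L1 as a covariate in a regression would not be an adequate fix, because L1 is itself affected by X0.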

Instrumental variables and Mendelian Randomisation

It is important to appreciate that non-ideal experimental data (e.g. suffering from noncompliance, missing data or measurement error) are not on a par with data arising from observational studies (as may be inferred from what is written above). Randomisation can be used as a tool to aid causal inference even when the randomised experiment is ‘broken’, for example as a result of non-compliance to randomised treatment. Such methods make use of randomisation as an instrumental variable (Angrist and Pischke, 2009). Instrumental variables have even been used with observational data, in particular when the instrument is a variable that holds genetic information (in which case it is known as Mendelian randomisation; see Davey Smith and Ebrahim, 2003) with genotype used in place of randomisation. This is motivated by the idea that genes are ‘randomly’ passed down from parents to offspring in the same way that treatment is allocated in double-blind randomised trials. Although this assumption is generally untestable (Hernán and Robins, 2006), there are situations in which it may be deemed more plausible than the other candidate set of untestable assumptions, namely conditional exchangeability.
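The use of randomisation as an instrument can be illustrated with the simple ratio (Wald) estimator in a simulated trial with non-compliance. The sketch assumes a constant treatment effect of 2 and an unmeasured variable U that influences both the treatment actually received and the outcome; these values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical trial with non-compliance (illustrative assumptions throughout).
z = rng.binomial(1, 0.5, size=n)                 # randomised assignment (the instrument)
u = rng.binomial(1, 0.5, size=n)                 # unmeasured confounder
x = rng.binomial(1, 0.1 + 0.5 * z + 0.3 * u)     # treatment actually received
y = 2.0 * x + 3.0 * u + rng.normal(size=n)       # constant treatment effect of 2

# "As-treated" comparison is confounded by U.
as_treated = y[x == 1].mean() - y[x == 0].mean()

# Wald estimator: effect of assignment on outcome / effect of assignment on treatment.
wald = (y[z == 1].mean() - y[z == 0].mean()) / (x[z == 1].mean() - x[z == 0].mean())

print(f"as-treated: {as_treated:.2f}, IV (Wald): {wald:.2f}")
```

The ratio estimator recovers a value near 2 here because assignment is randomised, affects the outcome only through the treatment received, and the effect is the same for everyone. When effects vary between individuals, the interpretation of the estimate requires further assumptions.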

Mediation analysis

Approaches (such as SEM) amenable to complex causal structures have opened the way to looking beyond the causal effect of an exposure on an outcome as a black box, and to asking ‘how does this exposure act?’. For example, if income has a positive effect on certain health outcomes, does this act simply by increasing access to health care, or are there other important pathways? Addressing such questions is the goal of mediation analysis and the estimation of direct/indirect effects (see Emsley et al, 2010, for a review). This area has seen an explosion of new methodology in recent years, with several semiparametric alternatives to SEM introduced.

Suggested Introductory Reading

Hernán MA, Robins JM (to appear, 2015) Causal Inference. Chapman & Hall/CRC. [First fifteen chapters available for download here.]

Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology 10(1):37–48.

Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA (2002) Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. American Journal of Epidemiology 155:176–184.

Pearl J (2010) An Introduction to Causal Inference. The International Journal of Biostatistics. 6(2): Article 7.

Angrist JD, Pischke J (2009) Mostly harmless econometrics: an empiricist’s companion. Princeton University Press.

Other references

Chakraborty B, Moodie EEM (2013) Statistical methods for dynamic treatment regimes. Springer.

Daniel RM, Cousens SN, De Stavola BL, Kenward MG, Sterne JAC (2013) Methods for dealing with time-dependent confounding. Statistics in Medicine 32(9):1584–1618.

Davey Smith G, Ebrahim S (2003) ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology 32:1–22.

Dawid AP (2000) Causal inference without counterfactuals. Journal of the American Statistical Association 95(450):407–448.

Emsley RA, Dunn G, White IR (2010) Mediation and moderation of treatment effects in randomised controlled trials of complex interventions. Statistical Methods in Medical Research 19(3):237–270.

Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology 10(1):37–48.

Hernán MA, Robins JM (2006) Instruments for causal inference: an epidemiologist’s dream? Epidemiology 17:360–372.

Hernández-Díaz S, Schisterman EF, Hernán MA (2006) The birth weight “paradox” uncovered? American Journal of Epidemiology. 164: 1115–1120.

Kline RB (2011) Principles and Practice of Structural Equation Modeling, 3rd ed. The Guilford Press.

Robins JM (1986) A new approach to causal inference in mortality studies with a sustained exposure period – application to control of the healthy worker survivor effect. Mathematical Modelling. 7:1393–1512.

Robins JM, Blevins D, Ritter G, Wulfsohn M (1992) G-estimation of the effect of prophylaxis therapy for pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology 3:319–336.

Robins JM, Hernán MA, Brumback B (2000) Marginal structural models and causal inference in epidemiology. Epidemiology 11:550–560.

Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika. 70(1):41–55.

Rubin DB (1978) Bayesian inference for causal effects: the role of randomisation. The Annals of Statistics. 6:34–58.

Recent and on-going methodological research in Causal Inference at LSHTM

Causal mediation analysis

Researchers from LSHTM are involved in several strands of research on mediation analysis, including dealing with multiple mediators, intermediate confounding and latent variables, specifically in studies of birthweight and infant mortality.

Mediation analysis in the presence of intermediate confounders and the links with SEM

Bianca De Stavola, Rhian Daniel (LSHTM) and George Ploubidis (IoE)

Intermediate confounders, i.e. variables that confound the mediator-outcome relationship and are affected by the exposure, are problematic for the decomposition of causal effects into direct and indirect components. The sufficient conditions most commonly cited for identifying natural direct and indirect effects (Pearl, 2001) include the so-called “cross-world assumption”, that conditionally on baseline confounders C, the counterfactuals Y(x,m) and M(x*) should be independent, even when x≠x*. This assumption precludes the existence of intermediate confounders. However, identification is also possible when this assumption is replaced by a weaker one (Petersen et al, 2006), namely that E{Y(1,m)-Y(0,m)|M(0)=m,C=c} = E{Y(1,m)-Y(0,m)|C=c}. Alternatively, Robins and Greenland (1992) showed that identification is also possible when this assumption is replaced by the condition that there can be no X-M interaction even on an individual level, i.e. that, for each subject i, Yi(1,m)-Yi(0,m) is the same for all levels of m. Both the Petersen et al assumption, and that of Robins and Greenland, can hold when intermediate confounding is present, but they imply restrictions on the form of the associational models to be fitted. In this work, we discuss these restrictions, together with further results, and in so doing clarify the link between the causal inference and SEM approaches to mediation analysis.

We have also written a routine in Stata (gformula) for estimating controlled direct effects and natural direct and indirect effects (or their randomized interventional analogues) in the presence of intermediate confounding using a fully-parametric approach via Monte Carlo simulation.

Daniel RM, De Stavola BL and Cousens SN (2011) gformula: Estimating causal effects in the presence of time-varying confounding or mediation using the g-computation formula. The Stata Journal 11(4):479–517.

De Stavola BL, Daniel RM, Ploubidis GB, Micali N (2015) Mediation analysis with intermediate confounding: structural equation modelling viewed through the causal inference lens. American Journal of Epidemiology. 181(1):64–80.

Mediation analysis with multiple mediators

Rhian Daniel, Bianca De Stavola and Simon Cousens (LSHTM) and Stijn Vansteelandt (University of Ghent)

The many recent contributions to the causal inference approach to mediation analysis have focused almost entirely on settings with a single mediator of interest, or a set of mediators considered en bloc; in many applications, however, researchers attempt a much more ambitious decomposition into numerous path-specific effects through many mediators. In this work, we gave counterfactual definitions of such path-specific estimands in settings with multiple mediators, when earlier mediators may affect later ones, showing that there are many ways in which decomposition can be done. We discussed the strong assumptions under which the effects are identified, suggesting a sensitivity analysis approach when a particular subset of the assumptions cannot be justified. The aim was to bridge the gap from “single mediator theory” to “multiple mediator practice,” highlighting the ambitious nature of this endeavour and giving practical suggestions on how to proceed.

Daniel RM, De Stavola BL, Cousens SN, Vansteelandt S (2015) Causal mediation analysis with multiple mediators. Biometrics. 71(1):1–14.

The (low) birthweight paradox

Bianca De Stavola, Richard Silverwood (LSHTM)

Overall, maternal factors such as low socio-economic position and smoking lead to a higher incidence of infant mortality. However, this relationship has been found to be reversed for babies of low birthweight, with factors such as maternal smoking appearing protective. This has been termed the (low) birthweight paradox and various explanations have been offered. One of these is that the apparent reversal of the effect is due to unaccounted confounding between birthweight and infant mortality. We are currently investigating this phenomenon in the ONS and Scottish Longitudinal Studies. The methodological aspect of this work involves incorporating a latent class approach to account for some of the unmeasured confounding.

Longitudinal causal effects and time-dependent confounding

In the setting of longitudinal data, LSHTM researchers are involved in methods for inferring short-term and total effects, methods for use when there is strong confounding, methods for use with routinely-collected data, as well as having recently been involved in pedagogic and software-development work.

Daniel RM, Cousens SN, De Stavola BL, Kenward MG, Sterne JAC (2013) Methods for dealing with time-dependent confounding. Statistics in Medicine 32(9):1584–1618.

Daniel RM, De Stavola BL and Cousens SN (2011) gformula: Estimating causal effects in the presence of time-varying confounding or mediation using the g-computation formula. The Stata Journal 11(4):479–517.

Application of Marginal Structural Models (MSMs) with Inverse Probability of Treatment Weighting (IPTW) to primary care records in the clinical area of diabetes

Ruth Farmer, Krishnan Bhaskaran (LSHTM) and Debbie Ford (MRC CTU)

Existing research is conflicting over whether the first line treatment of metformin for type 2 diabetes is protective against the development of cancer. Within this context, time-varying measures such as blood glucose level (HbA1c) and BMI are determinants of treatment, may be affected by prior treatment, and may also have an independent effect on risk of cancer. Work is ongoing to apply MSMs with IPTW to deal with time-dependent confounding in this context, using data from the Clinical Practice Research Datalink (CPRD). This will be one of the first attempts to apply MSM methodology to a real-world problem in a “big data” setting. There is a particular methodological focus on how the diabetes context and use of routinely collected data may lead to violation of the underlying assumptions needed for the MSM to produce valid causal inferences, and potential solutions to this.

Inferring short-term and total effects from longitudinal data

Ruth Keogh (LSHTM) and Stijn Vansteelandt (University of Ghent)

In longitudinal studies in which measures of exposures and outcomes are observed at repeated visits, interest may lie in studying short term or long term exposure effects. A short term effect is defined as the effect of an exposure at a given time on a concurrent outcome. Long term effects are the effects of earlier exposures on any subsequent outcome, and interest may be in two types of long term effect: (1) the total effect of an exposure at a given time on a subsequent outcome, including both indirect effects mediated through intermediate exposures and direct effects; (2) the joint effects of exposures at different time points on a subsequent outcome, which requires a separation of direct and indirect effects.

The emphasis in the statistical causal inference literature has been on studying joint effects, and in particular on special methods for handling the complication of time-dependent confounding, which arises in this situation when time-varying confounders are affected by past exposures and which cannot be handled by standard regression adjustment. However, investigating short term or total exposure effects provides a simpler starting point. Moreover, these effect estimates may often be the most useful, for example for a doctor making a decision about starting a patient on a treatment. In this work we have shown how, with careful control for confounding by past exposures, outcomes and time-varying covariates, short term and total effects can be estimated using conventional regression analysis. This approach is based on sequential conditional mean models (CMMs), including an extension to include propensity score adjustment. We have used simulation studies to compare this approach with IPW, finding that sequential CMMs give more precise estimates than IPW and provide double-robustness via propensity score adjustment. As part of this work we have also developed a new test of whether there are direct effects of past exposures on a subsequent outcome not mediated through intermediate exposures.

A manuscript is forthcoming.

Propensity score adjustment and strong confounding

Rhian Daniel (LSHTM) and Stijn Vansteelandt (University of Ghent)

Regression adjustment for the propensity score (p(C)=Pr(X=1|C), where X is the exposure and C are confounders) is a rarely-used alternative to the other propensity score methods, namely stratification, matching and inverse weighting. In recent work, we clarified the rationale for its use, in particular for estimating the information-standardised effect:

E{w(C)(Y1-Y0)}  (*)

where w(C)=[p(C){1-p(C)}] / E[p(C){1-p(C)}], since in GLMs, adjustment for the propensity score leads to a consistent estimator of (*) if the propensity score model is correctly specified even if its conditional relationship with the outcome (in the GLM) has been misspecified.

The estimand (*) is attractive in settings with strong confounding, since it gives greatest weight to those in the centre of the propensity score distribution, and least weight to those who, on the basis of their confounders, were nearly certain either to be exposed or not, about whom the observational data carry very little information on the treatment effect. When there is strong confounding, so that some subjects are bound to be exposed/unexposed, estimating the more usual estimand E(Y1-Y0) may be too ambitious, and anyway may not be of interest, since it requires asking what would happen if everyone were exposed, even though we know that some subjects, on the basis of their confounders, will never be exposed.
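The behaviour of the weights w(C) can be seen in a short numerical sketch. The function name `information_weights` and the example propensity scores are hypothetical, and the expectation in the denominator is replaced by a sample mean.

```python
import numpy as np

def information_weights(p):
    """w(C) = p(C){1-p(C)} / E[p(C){1-p(C)}], with the expectation taken as a sample mean."""
    p = np.asarray(p, dtype=float)
    raw = p * (1.0 - p)
    return raw / raw.mean()

# Hypothetical propensity scores for five subjects.
p = np.array([0.01, 0.25, 0.50, 0.75, 0.99])
w = information_weights(p)

for pi, wi in zip(p, w):
    print(f"p(C) = {pi:.2f}  ->  w(C) = {wi:.3f}")
```

The weights average to one by construction, peak at p(C) = 0.5, and are close to zero for subjects who were almost certain to be exposed or unexposed.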

In on-going work, we are extending this thinking to longitudinal studies where the problem of strong confounding becomes arguably even more acute. Typically, the regimes that are compared by g-methods are “always treat”, “never treat” etc. More pragmatic estimands may be sensible in situations where very few subjects have a propensity to be always/never treated.

Vansteelandt S, Daniel RM (2014) On regression adjustment for the propensity score. Statistics in Medicine; 33(23):4053–4072.

Causal inference and missing data

There are many links between the concepts and methods used in the fields of causal inference and missing data, and several LSHTM researchers are working on this intersection:

A doubly robust estimator to handle missing data and confounding simultaneously, with a focus on data from e-health records

Elizabeth (Fizz) Williamson (LSHTM)

Fizz recently developed a doubly robust estimator that combines an element of robustness to the models used to handle missing data with an element of robustness to the models used to handle confounding. She is currently extending this estimator to more realistic scenarios, particularly those with several partially missing variables. She is also working on a series of projects investigating methods for handling missing data within propensity score analyses, with an emphasis on analyses using data drawn from electronic health records.

Williamson EJ, Forbes A, Wolfe R (2012) Doubly robust estimators of causal exposure effects with missing data in the outcome, exposure or a confounder. Statistics in Medicine 31(30):4382–4400.

Methods for estimating treatment effect when there are departures from protocol (non-compliance and missing data) in a randomised trial

Karla Diaz-Ordaz (LSHTM)

Mendelian randomization

The technique of Mendelian randomization is used in applied research at LSHTM, particularly in the field of cardiovascular and other non-communicable diseases. Alongside this, methodological work inspired by the applied problems is also carried out.

Investigating non-linear effects with Mendelian randomization

Richard Silverwood and Frank Dudbridge (LSHTM)

Mendelian randomization studies have so far restricted attention to linear associations relating the genetic instrument to the exposure, and the exposure to the outcome, but this may not always be appropriate. For example, alcohol consumption is consistently reported as having a U-shaped association with cardiovascular events in observational studies. Richard Silverwood, Frank Dudbridge and others (Silverwood et al, 2014), proposed a novel method to assess non-linear causal effects using a binary genotype in Mendelian randomization studies based on estimating local average treatment effects for discrete levels of the exposure range, then testing for a linear trend in those effects. Their method gave a conservative test for non-linearity under realistic violations of the key assumption in extensive simulations, making their method useful for inferring departure from linearity when only a binary instrument is available. They found evidence for a non-linear causal effect of alcohol intake on several cardiovascular traits in the Alcohol-ADH1B Consortium, using the single nucleotide polymorphism rs1229984 in ADH1B as a genetic instrument.

Silverwood RJ, Holmes MV, Dale CE, et al. (2014) Testing for non-linear causal effects using a binary genotype in a Mendelian randomization study: application to alcohol and cardiovascular traits. International Journal of Epidemiology; 43(6):1781-90.

Causal inference for Health Economics

One of the most active causal inference research groups at the school is led by Richard Grieve, and focuses on the area of health economics. As such, there is overlap with the CSM theme of the same name. The methodological aspects of this work are outlined below.

Causal inference approaches for handling external validity, estimating continuous treatments, and handling aspects of time-varying confounding

Richard Grieve, Noemi Kreif, Karla Diaz-Ordaz, Manuel Gomes, Zia Sadique (LSHTM) and Jasjeet Sekhon (UC Berkeley)

LSHTM researchers are extending causal inference approaches to identify population treatment effects from RCTs, estimate the effects of ‘continuous’ treatments, and handle aspects of time-varying confounding when evaluating new health policies from observational data.

A general concern is that RCTs may fail to provide unbiased estimates of population average treatment effects. We have derived the requisite assumptions to identify population average treatment effects from RCTs. Our research provides placebo tests, which formally follow from the identifying assumptions and can assess whether they hold. We offer new research designs for estimating population effects that use non-randomised studies to adjust the RCT data. This approach is illustrated in a study evaluating the clinical effectiveness and cost-effectiveness of a clinical intervention, pulmonary artery catheterisation (see Hartman et al, 2015).

These placebo tests reveal that in some trial settings, the requisite underlying assumptions for estimating population treatment effects are not satisfied. This external validity concern was illustrated by an RCT of an intervention in primary care, ‘Telehealth’ in which only 20% of eligible patients agreed to participate. To address the external validity issue, we developed sensitivity analyses that combine RCT and observational data to re-estimate treatment effects (Steventon et al, 2015).

When evaluating the effects of continuous treatments (for example, different dosages of a drug), the generalised propensity score (GPS) can be used to adjust for confounding. However, unbiased estimation of the dose-response function assumes that both the GPS and the outcome-treatment relationship have been correctly specified. We introduce a machine learning method, the “Super Learner”, for model selection in estimating continuous treatment effects. We compare this Super Learner approach to parametric implementations of the GPS, and to outcome regression methods, in a re-analysis of the Risk Adjustment In Neurocritical care (RAIN) cohort study. Our paper highlights the importance of principled model selection for applied empirical analysis (Kreif et al, 2015).

A further strand of research considers alternative approaches for handling confounding in studies where outcomes are measured before and after an intervention. Our research contrasts the synthetic control method for the evaluation of health policies, with difference-in-differences (DiD) estimation. The synthetic control approach estimates treatment effects by constructing a weighted combination of control units, to represent outcomes the treated group would have experienced in the absence of receiving the treatment. DiD estimation assumes that pre-treatment, the outcomes between the treated and control groups follow parallel trends over time, whereas the synthetic control method allows for non-parallel trends. We extend the synthetic control approach to settings where there are multiple treated units (for example hospitals), in re-evaluating the effects of a recent hospital pay-for-performance (P4P) scheme on risk-adjusted hospital mortality. Ongoing research is contrasting the synthetic control, and DiD approaches with matching and regression methods.
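As a minimal sketch of the DiD idea described above, the following Python snippet computes the two-group, two-period DiD estimate from group means. The function name and the data are hypothetical and invented purely for illustration; they are not taken from the P4P evaluation, and the sketch does not implement the synthetic control method.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on mean outcomes.

    Under the parallel-trends assumption, the change in the control group
    stands in for the change the treated group would have had if untreated.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Invented illustrative data: risk-adjusted mortality (%) per hospital.
treated_pre = [10.0, 12.0, 11.0]
treated_post = [8.0, 9.5, 9.5]
control_pre = [9.0, 10.0, 11.0]
control_post = [8.5, 9.5, 10.5]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(round(effect, 2))
```

If the pre-treatment trends of the two groups differ, this simple contrast attributes the difference in trends to the intervention, which is the situation the synthetic control method is intended to address.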

Hartman, E., Grieve, R., Ramsahai, R. and Sekhon, JS. (2015). From sample average treatment effect to population average treatment effect on the treated: combining experimental with observational studies to estimate population treatment effects. JRSSA doi: 10.1111/rssa.12094

Free text available from:

Steventon A, Grieve R, Bardsley M (2015). An approach to assess generalizability in comparative effectiveness research: a case study of the Whole Systems Demonstrator cluster randomized trial comparing telehealth with usual care for patients with chronic health conditions. Medical Decision Making (in press).

Kreif, N, Grieve, R, Díaz, I, Harrison, D (2015). Evaluation of the effect of a continuous treatment: a machine learning approach with an application to treatment for traumatic brain injury. Health Economics (in press). Submitted version available as working paper from:

The first is the average causal effect (ACE) of X on Y expressed as a marginal risk difference and the second is the average causal effect in the exposed (also called the average treatment effect in the treated, or ATT) expressed as a risk ratio (marginal wrt confounders C). The third is a conditional causal log odds ratio, given C.

Sufficient conditions for these and other similar parameters to be identified can also be expressed in terms of potential outcomes. For the ACE, for example, these are:

Consistency: For $x = 0, 1$, if $X = x$ then $Y_x = Y$.

Conditional exchangeability: For $x = 0, 1$, $Y_x \perp\!\!\!\perp X \mid C$.

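
As a hedged illustration of how these conditions are used, the ACE can be rewritten in terms of observed quantities. The derivation below is a standard sketch for a discrete set of confounders C. It additionally assumes positivity, i.e. P(X = x | C = c) > 0 for every c with P(C = c) > 0, which is not among the conditions quoted above.

```latex
\begin{align*}
E(Y_x) &= \sum_c E(Y_x \mid C = c)\, P(C = c) \\
       &= \sum_c E(Y_x \mid X = x, C = c)\, P(C = c)
          && \text{(conditional exchangeability)} \\
       &= \sum_c E(Y \mid X = x, C = c)\, P(C = c)
          && \text{(consistency)}, \\[4pt]
\text{ACE} &= E(Y_1) - E(Y_0).
\end{align*}
```

The last line of the first display involves only the observed variables X, Y and C, so the marginal risk difference can be estimated by standardising the stratum-specific risks over the distribution of C.
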
Health Economics

Theme Co-ordinators: Richard Grieve, Linda Sharples

Please see here for slides and audio recordings of previous seminars relating to this theme.


Policy-makers in many countries require accurate estimates of effectiveness and cost-effectiveness, to inform clinical guidelines, and for deciding which public health interventions and health care technologies to provide. Health economic evaluations may provide misleading evidence because they fail to address issues such as low external validity, confounding, non-compliance, missing data and clustering.

Our research programme draws on insights from the causal inference literature, to propose new approaches for providing accurate estimates of effectiveness and cost-effectiveness. Most of our researchers are based within the Team for Health Economics, Policy, and Technology Assessment.

Within the overall theme, the following areas are of specific interest:


Confounding

Health economic evaluations commonly use observational studies, either alongside or instead of data from randomised controlled trials (RCTs). A major concern is that the results suffer from treatment selection bias due to confounding variables that influence both treatment and outcomes. Commonly used analytical methods for dealing with confounding, such as regression analysis or propensity score matching, can be highly sensitive to model specification.

We are undertaking research on improving methods in economic evaluations for dealing with confounding. Our recent and ongoing research programme in this area covers the following topics:

  • An assessment of the relative performance of a multivariate matching approach, Genetic Matching, in contrast to more conventional propensity score matching estimators [1-4]
  • An evaluation of double-robust methods for addressing confounding in evaluations of effectiveness and cost-effectiveness [5-6]
  • An investigation of a machine learning approach for evaluating the effectiveness of continuous treatments [7]
  • A critical examination of the relative merits of the synthetic control method for estimating treatment effects in longitudinal settings [8-9]
  • Comparing approaches for addressing time-varying confounding when informing sequential decisions
  • Considering mediation approaches in the context of decision-modelling and cost-effectiveness analysis

Key people: Richard Grieve, Noemi Kreif, Stephen O’Neill, Neil Hawkins.


Non-compliance

Recommendations encourage cost-effectiveness analyses (CEAs) to report intention to treat (ITT) estimates, but this may be insufficient for policy making, as new decision problems may arise subsequent to the trial design. Clinical decision-makers also require cost-effectiveness estimates for patients who meet particular treatment thresholds, or for lower levels of compliance, which better reflect routine practice. Per protocol (PP) analyses are common but will provide biased estimates.

We aim to improve methods for estimating causal treatment effects after deviation from protocol. In particular, the instrumental variables (IV) approach we propose can deal with non-compliance in settings where there are multiple endpoints.
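
To illustrate the basic IV idea only (not the multiple-endpoint method referred to above), here is a minimal sketch of the ratio (Wald) estimator with randomised arm as the instrument. The function name and data are hypothetical. Interpreting the result as an effect among compliers relies on the usual IV assumptions (assignment affects the outcome only through treatment received, and nobody does the opposite of their assignment), which are assumed here rather than checked.

```python
from statistics import mean


def wald_iv_estimate(assigned, received, outcome):
    """Ratio (Wald) IV estimator with randomised assignment as instrument.

    Numerator: ITT effect of assignment on the outcome.
    Denominator: effect of assignment on treatment actually received.
    """
    y1 = [y for z, y in zip(assigned, outcome) if z == 1]
    y0 = [y for z, y in zip(assigned, outcome) if z == 0]
    d1 = [d for z, d in zip(assigned, received) if z == 1]
    d0 = [d for z, d in zip(assigned, received) if z == 0]
    itt_outcome = mean(y1) - mean(y0)
    itt_receipt = mean(d1) - mean(d0)
    if itt_receipt == 0:
        raise ValueError("assignment does not affect treatment received")
    return itt_outcome / itt_receipt


# Invented data: 3 of 4 participants in the intervention arm comply,
# and nobody in the control arm receives the intervention.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 1, 0, 0, 0, 0, 0]
outcome = [6, 6, 6, 2, 2, 2, 2, 2]

cace = wald_iv_estimate(assigned, received, outcome)
print(cace)
```

In this toy example the ITT difference is diluted by the non-complier, and dividing by the difference in treatment receipt scales it back up to the effect among those who would comply.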

Key people: Richard Grieve

External validity

Evaluations that use RCTs lack external validity if the RCT and the target population differ according to patient and provider characteristics that modify relative cost-effectiveness. We lead NIHR-funded evaluations where clinical investigators suggest the potential lack of external validity could stop them applying the results in practice. We have developed methods that make plain the assumptions required to generalise effectiveness and cost-effectiveness estimates from an RCT, to target populations of prime interest [10].

Key people: Richard Grieve

Missing data

In economic evaluations a common problem is that there are missing data, for example, because patients are lost to follow-up or fail to respond to quality-of-life or resource use questionnaires. Missing data may be problematic because individuals with missing information tend to be systematically different from those with complete data. Most published studies fail to address this issue and report cost-effectiveness inferences based solely on the complete cases. Inappropriate methods will lead to biased results, and ultimately can affect the decision of whether an intervention should be prioritised.

While standard multiple imputation methods have been proposed for handling missing data in cost-effectiveness analysis, these may be insufficient in many settings. For example, they assume that individual observations are independent (which may be implausible in multicentre studies or meta-analyses of individual-participant data) or that the imputation model is correctly specified. In addition, the methods proposed assume that data are missing at random, i.e. that the probability of missingness depends only on the observed data. However, the probability of missing costs or outcomes may depend on unobserved values, i.e. data may be missing not at random.

We focus on the following areas:

Key people: Alexina Mason, Manuel Gomes, Richard Grieve


Clustering

CEAs often use data from cluster randomised trials (CRTs), where randomisation is at the level of the cluster (for example the hospital) rather than the individual. Here, statistical methods are required that recognise that costs and health outcomes may be correlated within clusters. However, most CEAs alongside CRTs use methods that assume observations are independent, which may lead to incorrect inferences.

Our research interests include:

Key people: Manuel Gomes, Richard Grieve


1. Sekhon, Jasjeet. Multivariate and propensity score matching with automated balance search. Journal of Statistical Software 2011

2. Sekhon, Jasjeet, and Richard D. Grieve. “A matching method for improving covariate balance in cost‐effectiveness analyses.” Health economics 21, no. 6 (2012): 695-714.

3. Radice, Rosalba, Roland Ramsahai, Richard Grieve, Noemi Kreif, Zia Sadique, and Jasjeet S. Sekhon. “Evaluating treatment effectiveness in patient subgroups: a comparison of propensity score methods with an automated matching approach.” The international journal of biostatistics 8, no. 1 (2012).

4. Kreif, Noemi, Richard Grieve, Rosalba Radice, Zia Sadique, Roland Ramsahai, and Jasjeet S. Sekhon. “Methods for estimating subgroup effects in cost-effectiveness analyses that use observational data.” Medical Decision Making 32, no. 6 (2012): 750-763.

5. Kreif, Noémi, Richard Grieve, Rosalba Radice, and Jasjeet S. Sekhon. “Regression-adjusted matching and double-robust methods for estimating average treatment effects in health economic evaluation.” Health Services and Outcomes Research Methodology 13, no. 2-4 (2013): 174-202.

6. Kreif, Noémi, Susan Gruber, Rosalba Radice, Richard Grieve, and Jasjeet S. Sekhon. “Evaluating treatment effectiveness under model misspecification: a comparison of targeted maximum likelihood estimation with bias-corrected matching.” Statistical methods in medical research (2014): 0962280214521341.

7. Kreif, Noémi, Richard Grieve, Iván Díaz, and David Harrison. “Evaluation of the Effect of a Continuous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain Injury.” Health economics 24, no. 9 (2015): 1213-1228.

8. Abadie, Alberto, Alexis Diamond, and Jens Hainmueller. “Synthetic control methods for comparative case studies: Estimating the effect of California’s tobacco control program.” Journal of the American Statistical Association 105, no. 490 (2010).

9. Kreif, Noémi, Richard Grieve, Dominik Hangartner, Alex James Turner, Silviya Nikolova, and Matt Sutton. “Examination of the Synthetic Control Method for Evaluating Health Policies with Multiple Treated Units.” Health economics (2015). doi: 10.1002/hec.3258.

10. Hartman, Erin, Richard Grieve, Roland Ramsahai, and Jasjeet S. Sekhon. “From sample average treatment effect to population average treatment effect on the treated: combining experimental with observational studies to estimate population treatment effects.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 178, no. 3 (2015): 757-778.

11. Díaz‐Ordaz, Karla, Michael G. Kenward, and Richard Grieve. “Handling missing values in cost effectiveness analyses that use data from cluster randomized trials.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 177, no. 2 (2014): 457-474.

12. Gomes, Manuel, Edmond S-W. Ng, Richard Grieve, Richard Nixon, James Carpenter, and Simon G. Thompson. “Developing appropriate methods for cost-effectiveness analysis of cluster randomized trials.” Medical Decision Making 32, no. 2 (2012): 350-361.

13. Gomes, Manuel, Richard Grieve, Richard Nixon, Edmond S‐W. Ng, James Carpenter, and Simon G. Thompson. “Methods For Covariate Adjustment In Cost‐Effectiveness Analysis That Use Cluster Randomised Trials.” Health economics 21, no. 9 (2012): 1101-1118.

14. Ng, Edmond SW, Karla Diaz-Ordaz, Richard Grieve, Richard M. Nixon, Simon G. Thompson, and James R. Carpenter. “Multilevel models for cost-effectiveness analyses that use cluster randomised trial data: an approach to model choice.” Statistical methods in medical research (2013): 0962280213511719.

Missing Data and Measurement Error

Theme Co-ordinators: Ruth Keogh, James Carpenter, Karla Diaz-Ordaz, Chris Frost

Please see here for slides and audio recordings of previous seminars relating to this theme.

Missing data


The problem of missing data is almost ubiquitous in medical research, in both observational studies and randomized trials. Until the advent of sufficiently powerful computers, much of the research in this area was focused on the problem of how to handle, in a practicable way, the lack of balance caused by incompleteness. An example of such a development was the key idea of the EM algorithm (Dempster et al 1977). As routine computation became less of a problem, attention moved to the much more subtle issue of the consequences of missing data on the validity of subsequent analyses. The seminal work was Rubin (1976), from which all subsequent work in this area has developed to a greater or lesser degree.

Although the underlying missing data concepts are the same for observational and randomized studies, the emphases differ somewhat in practice in the two areas. However, both are the subject of development within the Centre. From 2002, supported by several grants from the Economic and Social Research Council, an entire programme has been developed around the handling of missing data in observational studies. This includes the development of multiple imputation in a multilevel setting (e.g. Goldstein et al 2009, Carpenter et al 2010), a series of short courses, and the establishment of a leading website devoted to the topic:

which contains background material, answers to frequently asked questions, course notes, software, details of upcoming courses and events, a bibliography, and a discussion forum.

A central problem in the clinical trial setting is the appropriate handling of dropout and withdrawal in longitudinal studies. This has been the subject of great debate among academics, trialists and regulators for the last 10-15 years. Members of the centre have had long involvement in this (e.g. Diggle and Kenward 1994, Carpenter et al 2002). A textbook was published by Wiley on the broad subject of missing data in clinical studies (Molenberghs and Kenward 2007). More recently the UK NHS National Co-ordinating Centre for Research on Methodology commissioned a monograph on the subject which was published in 2008 (Carpenter and Kenward 2008). Members of the Centre are also actively involved in current regulatory developments. Two important documents have recently appeared. In the US an FDA commissioned National Research Council Panel on Handling Missing Data in Clinical Trials, chaired by Professor Rod Little, produced in 2010 a report, ‘The Prevention and Treatment of Missing Data in Clinical Trials.’ James Carpenter was one of several experts invited to give a presentation to this panel. Implementation of the guidelines in this report is to be discussed at the 5th Annual FDA/DIA Statistics Forum in April 2011, where Mike Kenward is giving the one day pre-meeting tutorial on missing data methodology. In Europe, again in 2010, the CHMP released their ‘Guideline on Missing Data in Confirmatory Clinical Trials’. James Carpenter, Mike Kenward and James Roger were members of the PSI working party that provided a response to the draft of this document (Burzykowski T et al. 2009).

At the School there continues a broad research programme in both the observational study and randomized trial settings, and there is an active continuing programme of workshops. Missing data is an issue for many of the studies run and analysed within the School, and there is much cross-fertilization across different research areas. There are also strong methodological links with other themes, especially causal inference; indeed, one recent piece of work explicitly connects the two areas (Daniel et al. 2011).


Those most directly involved in missing data research are:

Jonathan Bartlett, James Carpenter, Mike Kenward, James Roger (honorary), and two research students: Mel Smuk and George Vamvakis.

Many others have an interest in, and have contributed to, the area, including Rhian Daniel, Bianca de Stavola, George Ploubidis, and Stijn Vansteelandt (honorary).


Burzykowski T et al. (2009). Missing data: Discussion points from the PSI missing data expert group. Pharmaceutical Statistics. DOI: 10.1002/pst.391

Carpenter JR, Goldstein H and Kenward MG (2010). REALCOM-IMPUTE software for multilevel multiple imputation with mixed response types. Journal of Statistical Software, to appear.

Carpenter JR and Kenward MG (2008). Missing data in clinical trials – a practical guide. National Health Service Coordinating Centre for Research Methodology: Birmingham. Downloadable from….

Carpenter J, Pocock S and Lamm C (2002). Coping with missing values in clinical trials: a model based approach applied to asthma trials. Statistics in Medicine, 21, 1043-1066.

Daniel RM, Kenward MG, Cousens S, de Stavola B (2009) Using directed acyclic graphs to guide analysis in missing data problems. Statistical Methods in Medical Research, to appear.

Dempster AP, Laird NM and Rubin DB (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39, 1-38.

Diggle PJ and Kenward MG (1994). Informative dropout in longitudinal data analysis (with discussion). Applied Statistics, 43, 49-94.

Goldstein H, Carpenter JR, Kenward MG and Levin K (2009). Multilevel models with multivariate mixed response types. Statistical Modelling, 9, 173-197.

Molenberghs G and Kenward MG (2007). Missing Data in Clinical Studies. Chichester: Wiley.

Rubin DB (1976). Inference and missing data. Biometrika, 63, 581-592.

Measurement Error


The measurement of variables of interest is central to epidemiological study. Often, the measurements we obtain are noisy error-prone versions of the underlying quantity of primary interest. Such errors can arise due to technical error induced by imperfect measurement instruments and short-term fluctuations over time. An example is a single measurement of blood pressure, considered as a measure of an individual’s underlying average blood pressure. Variables obtained by asking individuals to answer questions about their behaviour or characteristics are also often subject to error, either due to the individual’s inability to accurately recall the behaviour in question or a tendency, for whatever reason, to over-estimate or under-estimate the quantity being requested.

The consequences of measurement error in a variable depend on the variable’s role in the substantive model of interest (Carroll et al). For example, independent error in the continuous outcome variable in a linear regression does not cause bias. In contrast, measurement error in the explanatory variables of regression models does, in general, cause bias. Measurement error in an exposure of interest may distort estimates of the exposure’s effect on the outcome of interest, while error in confounders will lead to imperfect adjustment for confounding, and hence to biased estimates of the effect of an exposure.
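
The contrast between error in the outcome and error in an explanatory variable can be made concrete under the classical error model. The following is a sketch in assumed notation (not taken from the cited references): the true exposure is X, the observed value is W = X + U, and U has mean zero and is independent of X and of the outcome error.

```latex
\begin{align*}
Y &= \alpha + \beta X + \varepsilon, \qquad W = X + U, \\
\beta_{\text{naive}} &= \frac{\operatorname{Cov}(Y, W)}{\operatorname{Var}(W)}
  = \beta\,\frac{\operatorname{Var}(X)}{\operatorname{Var}(X) + \operatorname{Var}(U)}
  = \lambda\beta, \qquad 0 < \lambda \le 1 .
\end{align*}
```

The naive slope is therefore attenuated towards zero by the factor λ. By contrast, if the outcome is replaced by Y* = Y + V with V independent of X, then Cov(Y*, X)/Var(X) = β is unchanged, and the error adds only to the residual variance.
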

When explanatory variables in regression models are categorical, the analogue of measurement error is misclassification. Unlike measurement errors, which can often plausibly be assumed to be independent of underlying true levels, a misclassification error is never independent of the underlying value of the predictor variable, and so different theory covers the effects of misclassification and measurement errors (White et al).

Over the past thirty years a vast array of methods has been developed to accommodate measurement errors and misclassification in statistical analysis models. While simple methods, including method of moments correction and regression calibration, have sometimes been applied in epidemiological research, more sophisticated approaches, such as maximum likelihood (Bartlett et al) and semi-parametric methods (Carroll et al), have received less attention. This is likely partly due to a relative scarcity of implementations in statistical software packages.
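
As a rough illustration of a method of moments correction, the simulation below uses two replicate measurements of an error-prone exposure to estimate the error variance and then rescales the naive slope. It is a minimal sketch under assumed conditions (classical, independent error with equal variance in both replicates, a single explanatory variable), not the approach of any of the cited papers. All names and parameter values are invented.

```python
import random
from statistics import mean, variance


def slope(x, y):
    """Ordinary least squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


random.seed(2024)
n, beta = 20000, 2.0
true_x = [random.gauss(0.0, 1.0) for _ in range(n)]
# Two replicate measurements with independent classical error (variance 1).
w1 = [x + random.gauss(0.0, 1.0) for x in true_x]
w2 = [x + random.gauss(0.0, 1.0) for x in true_x]
y = [beta * x + random.gauss(0.0, 1.0) for x in true_x]

naive = slope(w1, y)  # regression on a single error-prone measurement

# Method of moments: the variance of the replicate difference is twice
# the error variance, which gives an estimate of the reliability ratio.
error_var = variance([a - b for a, b in zip(w1, w2)]) / 2.0
reliability = (variance(w1) - error_var) / variance(w1)
corrected = naive / reliability

print(f"naive={naive:.2f} corrected={corrected:.2f}")
```

With these settings the true reliability ratio is 0.5, so the naive slope should be close to half the true slope of 2, and the corrected slope close to 2, up to sampling variability.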

Areas for future research efforts

Greater recognition of the effects of measurement error and misclassification in the analysis of epidemiological and clinical studies.

Increasing the accessibility of methods to deal with measurement error, through dissemination of methods and the implementation of methods into statistical software.

Development of methods that allow for the effects of measurement errors in causal models that describe how risk factors, and therefore risks of disease, change over time.


Bartlett J. W., De Stavola B. L., Frost C. (2009). Linear mixed models for replication data to efficiently allow for covariate measurement error. Statistics in Medicine; 28: 3158-3178.

Carroll R. J., Ruppert D., Stefanski L. A., Crainiceanu C. M. (2006). Measurement error in nonlinear models. Chapman & Hall/CRC, Boca Raton, FL, US.

Frost C., Thompson S. G. (2000). Correcting for regression dilution bias: comparison of methods for a single predictor variable. Journal of the Royal Statistical Society A; 163: 173-189.

Frost C., White I. R. (2005). The effect of measurement error in risk factors that change over time in cohort studies: do simple methods overcorrect for ‘regression dilution’? International Journal of Epidemiology; 34: 1359-1368.

Gustafson, P. (2003). Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustments. Chapman and Hall/CRC Press.

White I., Frost C., Tokunaga S. (2001). Correcting for measurement error in binary and continuous variables using replicates. Statistics in Medicine; 20:3441-3457

Knuiman M. W., Divitini M. L., Buzas J. S., Fitzgerald P. E. B. (1998). Adjustment for regression dilution in epidemiological regression analyses. Annals of Epidemiology; 8: 56-63.

Survival Analysis

Theme Co-ordinators: Bernard Rachet, Aurelien Belot

Please see here for slides and audio recordings of previous seminars relating to this theme.


Survival analysis is at the core of any study of time to a particular event, such as death, infection, or diagnosis of a particular cancer. It is therefore fundamental to most epidemiological cohort studies, as well as many randomised controlled trials (RCTs).

An important issue in survival analysis is the choice of time scale: this could be, for example, time since entry into the study (or since first treatment in an RCT), time since a particular event (e.g. the Japanese tsunami), or time since birth (i.e. age). The latter is particularly relevant for epidemiological studies of chronic diseases, where age often exerts a substantial confounding effect (see [1], Chapter 6, for a discussion of alternative time scales).

Usually not all participants are followed up until they experience the event of interest, leading to their times being ‘censored’. In this case, the available information consists only of a lower bound for their actual event time. It is typically assumed that the process giving rise to censoring is independent of the process determining time to the event of interest. In contrast to most regression approaches (which typically involve modelling means of distributions given explanatory variables), many survival analysis models are defined in terms of the hazard (or rate) of the event of interest. Within this framework, the hazard is expressed as a function of explanatory variables and an underlying ‘baseline’ hazard. Fully parametric models assume a particular form for the baseline hazard, the simplest being that it is constant over time (Poisson regression). Cox’s proportional hazards model, perhaps the most popular model for survival data, makes no parametric assumptions about the baseline hazard. Both the Poisson and Cox regression models assume the hazards to be proportional for individuals with different values of the explanatory variables. This assumption can be relaxed, for example through use of Aalen’s additive hazard model.
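
The simplest case mentioned above, a constant hazard with right censoring, can be sketched in a few lines. The maximum likelihood estimate of a constant hazard is the number of observed events divided by the total follow-up time, and the ratio of two such rates is a hazard ratio under proportional hazards. The function name and data below are hypothetical and invented for illustration.

```python
import math


def constant_hazard_mle(times, events):
    """MLE of a constant hazard: events divided by total follow-up time.

    times  -- follow-up time for each participant (event or censoring time)
    events -- 1 if the event was observed, 0 if the time is censored
    """
    total_events = sum(events)
    person_time = sum(times)
    return total_events / person_time


# Invented follow-up data (years); censored participants contribute
# person-time but no event.
control_times, control_events = [2.0, 3.0, 5.0, 5.0, 1.0], [1, 1, 0, 0, 1]
treated_times, treated_events = [4.0, 5.0, 5.0, 5.0, 1.0], [1, 0, 0, 0, 1]

rate_control = constant_hazard_mle(control_times, control_events)
rate_treated = constant_hazard_mle(treated_times, treated_events)
hazard_ratio = rate_treated / rate_control

# Large-sample standard error of the log rate ratio (a standard result).
log_hr_se = math.sqrt(1 / sum(treated_events) + 1 / sum(control_events))

print(rate_control, rate_treated, hazard_ratio)
```

A Poisson regression with log person-time as an offset generalises this calculation to several explanatory variables. The toy data here are far too small for the standard error to be reliable and serve only to show the arithmetic.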

Generalizations to deal with repeated episodes of an event of interest, such as infection, are possible through the introduction of random effects that capture the correlation among events that occur to the same individual. Within the survival analysis literature these are referred to as frailty models. → Design and analysis for dependent data

An alternative approach to modelling survival data, more in keeping with most regression techniques, involves modelling the (logarithmically transformed) survival times directly. These are expressed as a linear function of explanatory variables and an error term, with the choice of distribution for the error term leading to the family of accelerated failure time models. When the survival times are assumed to be exponentially distributed, the accelerated failure time model is equivalent to a Poisson regression model.
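
The equivalence in the exponential case can be sketched as follows. The notation is assumed for illustration: T is the survival time, x the vector of explanatory variables, σ a scale parameter, and ε an error term with the standard extreme value distribution, P(ε > e) = exp(−exp(e)).

```latex
\begin{align*}
\log T &= \beta^{\top} x + \sigma \varepsilon
  && \text{(accelerated failure time model)} \\
\sigma = 1 \;\Rightarrow\;
P(T > t \mid x) &= P\!\left( e^{\varepsilon} > t\, e^{-\beta^{\top} x} \right)
  = \exp\!\left( -t\, e^{-\beta^{\top} x} \right), \\
\lambda(t \mid x) &= e^{-\beta^{\top} x}
  && \text{(constant hazard, proportional in } x\text{)} .
\end{align*}
```

With σ = 1 the survival times are exponential, and the hazard is constant over time with log hazard ratios equal to −β, which is the model fitted by Poisson regression. Other values of σ give a Weibull model, and other error distributions give other members of the accelerated failure time family.
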

Most of our applications of survival analysis models involve various flavours of the models mentioned above. However specific issues arise in certain contexts and are of interest to our group. These are discussed below.

Areas of current interest

Competing events

Censoring may occur for several reasons. A particular setting where censoring is not independent of the process governing the event of interest arises when there are competing events. Competing events are events that remove the individual from being at risk of the event of interest, in other words they preclude its occurrence. This happens for example if we study lung cancer mortality while individuals may die of other causes. Obviously the termination of the follow-up of individuals who die from other causes is not the same as loss to follow-up because the latter does not prevent the occurrence of the event of interest after time is censored.

The issues and methods arising in the analysis of competing events have been discussed in the biostatistical literature since the 1980s (for a review see [2]), but have not really filtered into epidemiological practice, with the notable exception of applications to AIDS research [3]. They are only marginally discussed in the RCT literature, where the problem is usually dealt with by creating composite events. → Analysis of clinical trials

There are two main possible approaches to the analysis of data affected by competing events:

a) Carrying out a so-called ‘cause-specific’ analysis, that is, adopting traditional survival analysis methods in which competing events are treated as censoring events. Note however that ‘cause-specific’ in this context is a misnomer, since the estimated effect depends on the rates generating all the other events (see [1], page 66). The main issue with this approach is one of interpretation, as all estimated effects are conditional on not having experienced the competing event.

b) Adopting a different focus, that is, modelling the cumulative incidence of the event of interest as opposed to its hazard (or rate). This approach was first proposed by Fine and Gray [4] but belongs to the broader family of inverse probability weighting (IPW) estimators (e.g. [5]) that has also been proposed in other contexts, notably to deal with informative missingness and selection bias [6-7]. → Causal inference, Missing data and measurement error
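
The difference between the two approaches can be seen in a small descriptive example. The sketch below computes the nonparametric cumulative incidence of one cause in the presence of a competing cause, and compares it with one minus the Kaplan-Meier estimate obtained by treating competing events as censored. This is the descriptive estimator only, not the Fine and Gray regression model; the function names and data are hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Nonparametric cumulative incidence for one cause.

    times  -- follow-up times
    causes -- 0 for censored, otherwise an integer code for the event type
    Returns a list of (time, cumulative incidence) at each distinct event time.
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    overall_survival = 1.0  # probability of being free of any event so far
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [c for (u, c) in data if u == t]
        d_interest = sum(1 for c in tied if c == cause_of_interest)
        d_any = sum(1 for c in tied if c != 0)
        if d_any > 0:
            cif += overall_survival * d_interest / at_risk
            overall_survival *= 1.0 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= len(tied)
        i += len(tied)
    return curve


def naive_one_minus_km(times, causes, cause_of_interest=1):
    """1 - Kaplan-Meier, treating competing events as censoring."""
    data = sorted(zip(times, causes))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [c for (u, c) in data if u == t]
        d = sum(1 for c in tied if c == cause_of_interest)
        survival *= 1.0 - d / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return 1.0 - survival


# Invented data: cause 1 = event of interest, 2 = competing event, 0 = censored.
times = [1, 2, 3, 4, 5, 6]
causes = [1, 2, 1, 2, 1, 0]

curve = cumulative_incidence(times, causes)
naive = naive_one_minus_km(times, causes)
print(curve[-1][1], naive)
```

In this toy data set the naive quantity exceeds the cumulative incidence, because it describes a hypothetical setting in which the competing event cannot occur rather than the proportion who actually experience the event of interest.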

Net survival

Information on cancer survival is essential for cancer control and has important implications for cancer policy. The primary indicator of interest is net survival, a hypothetical survival measure: the survival that would be observed if patients were subject only to mortality from the disease of interest, with the mortality rate of this disease remaining as it is in the presence of competing events, which is the only situation that can be observed.

Two approaches attempt to estimate net survival: cause-specific survival and relative survival. Relative survival [8] is the standard approach of estimating population-based cancer survival, when the actual cause of death is not accurately known. Although widely used in the cancer field, it can be applied to any disease at population level. Relative survival was originally defined as the ratio of the observed survival probability of the cancer patients and the survival probability that would have been expected if the patients had had the same mortality probability as the general population (background mortality) with similar demographic variables e.g. age, sex, calendar year. Background mortality is derived from life tables stratified at least by age, sex and calendar time.
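
The original ratio definition of relative survival can be sketched as follows. This is a deliberately simplified illustration with hypothetical function names and invented numbers: it uses a single set of life-table probabilities for the whole patient group, whereas in practice expected survival is derived from life-table rates matched to each patient's age, sex and calendar year.

```python
def expected_survival(annual_death_probs):
    """Cumulative expected survival from successive annual life-table
    probabilities of death for a matched general-population group."""
    survival = 1.0
    curve = []
    for q in annual_death_probs:
        survival *= 1.0 - q
        curve.append(survival)
    return curve


def relative_survival(observed, expected):
    """Ratio of observed to expected survival at each follow-up year."""
    return [o / e for o, e in zip(observed, expected)]


# Invented numbers for illustration only.
observed = [0.80, 0.68, 0.60]      # observed patient survival at 1, 2, 3 years
background_q = [0.02, 0.02, 0.03]  # life-table annual death probabilities

expected = expected_survival(background_q)
ratios = relative_survival(observed, expected)
print([round(r, 3) for r in ratios])
```

Each ratio exceeds the corresponding observed survival, because part of the observed mortality is attributed to background causes rather than to the cancer.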

Unbiased estimator of net survival
Both approaches (cause-specific and relative survival) provide biased estimates of net survival because of the competitive censoring, in particular that due to age. An unbiased descriptive estimator of net survival using the principle of inverse probability weighting has recently been proposed alongside the modelling approach (Pohar-Perme M, Stare J, Estève J. Biometrics 2011 – in review).

Multivariable excess hazard models
Relative survival is the survival analogue of excess mortality. Additive regression models for relative survival estimate the hazard at time t since diagnosis of cancer as the sum of the expected (background) hazard of the general population at time t and the excess hazard due to cancer [9-11]. More flexible models, which use splines to model the baseline excess hazard function of death and the non-proportionality of covariate effects, have recently been developed [12-14]; modelling the log-cumulative excess hazard has also been proposed [15-16]. Alternative approaches were recently developed [17].
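
In symbols, the additive model described above can be sketched as follows. The notation is assumed for illustration: λ_O is the observed (all-cause) hazard of the patients, λ_E the expected hazard taken from the life table for demographic variables z, λ_X the excess hazard given covariates x, and S_N the corresponding net survival.

```latex
\begin{align*}
\lambda_O(t \mid x) &= \lambda_E(t \mid z) + \lambda_X(t \mid x), \\
S_N(t \mid x) &= \exp\!\left( -\int_0^t \lambda_X(u \mid x)\, du \right).
\end{align*}
```

Because the hazards add, the corresponding survival functions multiply for an individual patient, so that observed survival is the product of expected survival and net survival. This is the link between the excess hazard model and the ratio definition of relative survival.
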

Unbiased estimation of net survival requires the inclusion of the main censoring variables in the excess hazard models, variables usually included in the life tables [18].

Current Work

Life tables
Estimation of net survival relies on accurate life tables. Methodology based on a multivariable flexible Poisson model has been developed in order to build complete, smoothed life tables for subpopulations, as defined by region, deprivation, ethnicity etc. [19].

Survival on sparse data
In contrast with incidence and mortality, very little work has been done on the estimation of survival based on sparse data or small areas [20]. The main challenge in survival is the additional dimension of time since diagnosis. Multilevel modelling and Bayesian approaches are the two main possible routes. Ultimately, presentation of such survival results can easily mislead healthcare policy makers, and methodological work on mapping and funnel plots is needed [21].

Public health relevance
Several indicators (avoidable deaths, population ‘cure’ parameters, crude probability of death, partitioned excess mortality) have been explored to present cancer survival results in ways more relevant for public health and health policy.

Missing data and misclassification
The analysis of routine, population-based data always faces the problem of incomplete data, for which it may be difficult or impossible to obtain the required complementary information. A tutorial paper explored the estimation of relative survival when the data are incomplete [22]. Even when complete, tumour stage in particular may be misclassified, compromising comparisons of cancer survival between subpopulations.

Disparities in cancer survival
Inequalities in cancer survival are still not well understood and structural equation modelling appears to be a possible approach to investigate potential causal pathways.


1. Clayton D and Hills M. Statistical Models in Epidemiology. Oxford University Press, 1993, Oxford.

2. Putter H, Fiocco M, Geskus RB. Tutorial in biostatistics: Competing risks and multi-state models. Statistics in Medicine 2007; 26: 2389–2430.

3. CASCADE Collaboration. Effective therapy has altered the spectrum of cause specific mortality following HIV seroconversion. AIDS, 2006, 20:741–749

4. Fine JP, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association 1999; 94: 496–509.

5. Klein JP, Andersen PK. Regression Modeling of Competing Risks Data Based on Pseudovalues of the Cumulative Incidence Function. Biometrics 2005: 61, 223–229.

6. Robins JM, et al. Semiparametric regression for repeated outcomes with non-ignorable non-response. Journal of the American Statistical Association. 1998; 93 1321-1339.

7. Hernán MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology 2004;15:615-625.

8. Ederer F, Axtell LM, Cutler SJ. The relative survival: a statistical methodology. Natl Cancer Inst Monogr 1961; 6: 101-21.

9. Hakulinen T, Tenkanen L. Regression analysis of relative survival rates. J Roy Stat Soc Ser C 1987; 36: 309-17.

10. Estève J, Benhamou E, Croasdale M, Raymond L. Relative survival and the estimation of net survival: elements for further discussion. Stat Med 1990; 9: 529-38.

11. Dickman PW, Sloggett A, Hills M, Hakulinen T. Regression models for relative survival. Stat Med 2004; 23: 51-64.

12. Bolard P, Quantin C, Abrahamowicz M, Estève J, Giorgi R, Chadha-Boreham H, Binquet C, Faivre J. Assessing time-by-covariate interactions in relative survival models using restrictive cubic spline functions. J Cancer Epidemiol Prev 2002; 7: 113-22.

13. Giorgi R, Abrahamowicz M, Quantin C, Bolard P, Estève J, Gouvernet J, Faivre J. A relative survival regression model using B-spline functions to model non-proportional hazards. Stat Med 2003; 22: 2767-84.

14. Remontet L, Bossard N, Belot A, Estève J, FRANCIM. An overall strategy based on regression models to estimate relative survival and model the effects of prognostic factors in cancer survival studies. Stat Med 2007; 26: 2214-28.

15. Nelson CP, Lambert PC, Squire IB, Jones DR. Flexible parametric models for relative survival, with application in coronary heart disease. Stat Med 2007; 26: 5486-98.

16. Lambert PC, Royston P. Further development of flexible parametric models for survival analysis. Stata J 2009; 9: 265-90.

17. Perme MP, Henderson R, Stare J. An approach to estimation in relative survival regression. Biostatistics 2009; 10: 136-46.

18. Estève J, Benhamou E, Raymond L. Statistical methods in cancer research, volume IV. Descriptive epidemiology. (IARC Scientific Publications No. 128). Lyon: International Agency for Research on Cancer, 1994.

19. Cancer Research UK Cancer Survival Group. Life tables for England and Wales by sex, calendar period, region and deprivation. 2004.

20. Quaresma M, Walters S, Gordon E, Carrigan C, Coleman MP, Rachet B. A cancer survival index for Primary Care Trusts. Office for National Statistics, 7 Sep 2010.

21. Spiegelhalter DJ. Funnel plots for comparing institutional performance. Statistics in Medicine 2005; 24: 1185-202.

22. Nur U, Shack LG, Rachet B, Carpenter JR, Coleman MP. Modelling relative survival in the presence of incomplete data: a tutorial. IJE 2010; 39: 118-28.

Time Series Regression Analysis

Theme Co-ordinators: Antonio Gasparrini, Ben Armstrong

Please see here for slides and audio recordings of previous seminars relating to this theme.

This page is split into the following sections:

  1. Time series analysis for biomedical data
  2. Applications of time series regression
  3. Methodological issues
  4. Contributions of LSHTM researchers
  5. LSHTM researchers involved in developing or using time series regression methodology
  6. Publications by LSHTM researchers
  7. Key references on methods

1. Time series analysis for biomedical data

A time series may be defined as a sequence of measurements taken at (usually equally-spaced) ordered points in time.

Statistical methods applied to time series data were originally developed mainly in econometrics, and then used in many other fields, such as ecology, physics and engineering. In the original applications the focus was on prediction, and the aim was to produce an accurate forecast of future measurements given an observed series. The standard statistical approaches adopted for this purpose usually rely on autoregressive integrated moving average (ARIMA) and related models.

Time series designs are increasingly being exploited in biomedical research, owing to the availability of routinely collected series of administrative or medical data, such as mortality or morbidity counts, environmental measures, and changes in socio-economic or demographic indices. Within this research area, time series methods have been subject to intense methodological development over the last 20 years. In contrast with the original interest in prediction, the main aim of time series analysis in biomedical applications is commonly to assess the association between an outcome and either a predictor series or an intervention. Here the focus is instead on estimation, and the models reduce to the more traditional regression framework, although possibly in non-standard versions.

Two main features characterize time series data from a statistical viewpoint: the correlation displayed by observations and their temporal sequence. Statistical models need to cope with the former, in order to provide accurate inferences, and may exploit the latter, with the intention to strengthen the evidence on the causal nature or clarify details of the association under study.

2. Applications of time series regression

2.1 Time series regression of short-term associations

A topic of intense methodological research and application in time series analysis is the study of short-term health associations. In particular, time series methods have been widely applied in environmental epidemiology in recent decades to investigate the acute health effects of air pollution, and more recently of outdoor temperature and other weather parameters. This approach exploits well-known decomposition techniques for time series data, which filter out long-term and seasonal trends in the analysis of short-term dependencies between time-varying environmental factors and health outcomes. This method controls by design for time-fixed factors and for other confounders that change slowly in time.

Time series studies of short-term associations compare the outcome and exposure series, as in the example below illustrating the daily variation in mortality counts and outdoor temperature over a 14-year period in New York. The main methodological issues in this approach are the selection of smoothing methods for the decomposition of the series, the presence and estimation of delayed effects, and the potential confounding by other time-varying factors.
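The idea of filtering out long-term and seasonal trends can be sketched with a simple centred moving average. This is a deliberately simplified stand-in chosen for illustration, not the spline-based regression approach typically used in the studies cited on this page; the function names and the simulated series are hypothetical.

```python
import math


def centred_moving_average(series, window):
    """Centred moving average with an odd window; edge positions are None."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out


def short_term_deviations(series, window):
    """Series minus its smooth trend, leaving short-term variation only."""
    trend = centred_moving_average(series, window)
    return [None if t is None else y - t for y, t in zip(series, trend)]


# Hypothetical daily counts: a smooth seasonal cycle plus a one-day
# excess of 15 events on day 200 (e.g. following an exposure peak).
counts = [
    100 + 20 * math.cos(2 * math.pi * d / 365) + (15 if d == 200 else 0)
    for d in range(365)
]
dev = short_term_deviations(counts, window=31)
```

After removing the 31-day trend, the seasonal cycle almost vanishes from `dev`, while the one-day excess on day 200 remains clearly visible. The choice of window (or, in regression models, the degree of smoothing) governs how much slow variation is removed.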

2.2 Interrupted time series for evaluating interventions or events

The importance of robust evaluation of public health interventions is increasingly recognised, yet public health interventions are often complex and evaluation methods used in clinical medicine (such as randomised controlled trials) are not always feasible. Other ‘quasi-experimental’ designs are therefore needed in order to explore the effect of an intervention on health outcomes, one of the strongest of which is the interrupted time series (ITS) design. ITS requires a series of observations taken repeatedly over time before and after an intervention. The underlying trend in the outcome is established and can be used to estimate the counterfactual, that is, what would have happened if the intervention had not taken place. The impact of the intervention is then assessed by examining any change in the post-intervention period given the trend in the pre-intervention period. The intervention may lead to a change in level, a change in slope or both. This framework is illustrated in the figure below.

Interrupted time series can be used to explore the impact of public health interventions or unplanned events. In the example illustrated in the figure below, an ITS design is adopted to assess the impact of the financial crisis in Spain on suicides. The main methodological issues surrounding the ITS design concern the assumptions and methods used to model trends and to control for potential time-varying confounders of the before-after comparison.

Trend in monthly suicide rates for all of Spain before and since the financial crisis (Lopez Bernal et al 2013)
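The level-change and slope-change framework described in this section can be sketched as a segmented linear regression. The example below is a minimal illustration using ordinary least squares on a simulated, noise-free series; the helper names and all numbers are hypothetical, and real analyses usually need to handle counts, seasonality and autocorrelation as well.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def its_design(n_times, start):
    """Columns: intercept, time, post-intervention indicator, time since start."""
    rows = []
    for t in range(n_times):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, post * (t - start)])
    return rows


# Hypothetical monthly series: baseline 50, trend +0.5 per month,
# then a level change of -5 and a slope change of -0.3 from month 24.
X = its_design(48, 24)
true_coefs = [50.0, 0.5, -5.0, -0.3]
y = [sum(b * x for b, x in zip(true_coefs, row)) for row in X]

intercept, trend, level_change, slope_change = ols(X, y)
# Counterfactual at the last month: the pre-intervention trend carried forward.
counterfactual_end = intercept + trend * 47
```

The coefficient on the post-intervention indicator estimates the change in level, and the coefficient on time since the intervention estimates the change in slope. The counterfactual is the projection of the pre-intervention intercept and trend into the post-intervention period.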

3. Methodological issues

The regression analysis of biomedical time series data poses several methodological problems, which have been the subject of intense research in the last few years. The main research directions are summarized below. References are provided in the related sections.

  • Model selection: time series models are usually built with a pre-defined set of potential confounders. However, some criteria are needed to select other model parameters, such as the degree of control for seasonal and long-term trends, or the adequacy of assumptions on the shape of the exposure-response relationship for predictors showing potential non-linear effects. Some investigators have tested the comparative performance of selection criteria based on information criteria (Akaike, Bayesian or related), minimization of partial autocorrelation of residuals, (generalized) cross-validation and others. Further research is needed to produce robust and general selection criteria.
  • Smoothing methods: the specification of non-linear exposure-response relationships for predictors in the regression model is essential both to determine the association with the exposure of interest and to control for potential confounders. Smoothing techniques based on both parametric and non-parametric methods have been proposed in time series analysis. The former usually rely on regression splines within generalized linear models (GLM), while the latter are specified through smoothing or penalized splines within generalized additive models (GAM).
  • Distributed lag (non-linear) models: commonly the effect of an exposure is not limited to the day it occurs, but persists for further days or weeks. This introduces the additional problem of modelling the lag structure of the exposure-response relationship. This issue was initially addressed by distributed lag models, which allow the linear effect of a single exposure event to be distributed over a specific period of time. More recently, this methodology has been generalized to non-linear exposure-response relationships through distributed lag non-linear models, a modelling framework which can flexibly describe simultaneously non-linear and delayed associations.
  • Harvesting effect (mortality displacement): this phenomenon arises when applying an ecological time series analysis to grouped data, for example mortality counts. The conceptual framework is based on the assumption that the exposure affects mainly a pool of frail individuals, whose events are only brought forward by a brief period of time by the effect of exposure. For non-recurrent outcomes, the depletion of the pool following a high exposure event results in some reduction of cases a few days later, thereby reducing the overall long-term impact (see figure below). Specific models are needed to account for this reduction in the overall effect and thereby produce accurate estimates.
CSM conceptual framework for harvesting effect
  • Two-stage analysis: the usual approach to time series studies on environmental factors involves the analysis of series from multiple cities or regions. The complexity of the regression models prevents the specification of a very highly parameterized hierarchical structure in a single multilevel model. The analysis is instead carried out in two stages, with a common city-specific model followed by a meta-analysis to pool the results. The specification of complex exposure-response relationships in the first stage requires the development of non-standard meta-analytic techniques, such as meta-smoothing and multivariate meta-analysis.
  • Time-varying confounders: whilst interrupted time series designs are rarely affected by normal confounders, such as differences in socio-economic status or age composition, which typically change only slowly over time, they may be affected by time-varying confounders. This is particularly an issue if the confounders are unmeasured and change over the same period as the intervention, for example other concurrent events or policies. Design adaptations may be introduced to address this limitation, such as the introduction of a control series, multiple baseline designs (where the intervention is introduced in different locations at different times) and multiple phases (where the intervention is first introduced then removed to test whether the effect is reversed).
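The distributed lag and harvesting ideas in the list above can be sketched in a few lines. The code builds the matrix of lagged exposures that a distributed lag model regresses on, and shows how negative coefficients at later lags reduce the net (cumulative) effect. The function names are hypothetical and the lag coefficients are invented for illustration, not estimates from any study.

```python
def lag_matrix(exposure, max_lag):
    """Row for day t holds exposure at t, t-1, ..., t-max_lag.

    Days without a full exposure history (the first max_lag days) are skipped.
    """
    return [
        [exposure[t - lag] for lag in range(max_lag + 1)]
        for t in range(max_lag, len(exposure))
    ]


def cumulative_effect(lag_coefs):
    """Net log relative risk: the sum of the lag-specific coefficients."""
    return sum(lag_coefs)


# Lagged design matrix for a short hypothetical exposure series.
rows = lag_matrix([1, 2, 3, 4, 5, 6, 7, 8], max_lag=2)

# Hypothetical log relative risks per unit exposure at lags 0 to 5:
# positive at short lags, negative later (a deficit consistent with harvesting).
lag_coefs = [0.030, 0.015, 0.005, -0.010, -0.008, -0.002]
early_effect = sum(lag_coefs[:3])
net_effect = cumulative_effect(lag_coefs)
```

Here the effect accumulated over lags 0 to 2 exceeds the net effect over lags 0 to 5, because part of the early excess is offset by the later deficit. A model restricted to short lags would therefore overstate the overall impact under this scenario.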

4. Contributions of LSHTM researchers

4.1 Methodological innovations:

Statisticians at the LSHTM have made contributions to time series regression methodology either in explicitly methodological papers or in innovations published in reports of substantive epidemiological studies. A tutorial paper by several LSHTM researchers provides an overview of methods (Bhaskaran et al. 2013). Another paper summarizes several issues as potential candidates for methodological work, focusing in particular on temperature-health associations (Gasparrini and Armstrong 2010).

Distributed lag non-linear models. Several published methodological articles have proposed a new, more flexible way to model lagged relationships, through the framework of distributed lag non-linear models (Armstrong 2006; Gasparrini et al. 2010), implemented in the R package dlnm (Gasparrini 2011) – see figure. Later work presents methods for estimating the attributable number of deaths from distributed lag non-linear models (Gasparrini and Leone 2014; Gasparrini et al. 2015).
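To give a sense of what an attributable number is, the sketch below uses the standard attributable-fraction formula for a given cumulative log relative risk. It is an illustrative simplification with hypothetical names and numbers; it does not reproduce the specific definitions for lagged exposures developed in the papers cited above.

```python
import math


def attributable_fraction(log_rr):
    """Fraction of cases attributable to exposure: 1 - exp(-log RR) = (RR - 1) / RR."""
    return 1.0 - math.exp(-log_rr)


def attributable_number(log_rr, n_cases):
    """Attributable fraction applied to the number of observed cases."""
    return attributable_fraction(log_rr) * n_cases


# Hypothetical example: cumulative relative risk of 1.25 and 500 deaths
# observed on days at that exposure level.
log_rr = math.log(1.25)
af = attributable_fraction(log_rr)
an = attributable_number(log_rr, 500)
```

With a relative risk of 1.25, one fifth of the observed deaths would be attributed to the exposure under this formula.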

Two stage analyses. Other methodological efforts have explored ways to pool and explore heterogeneity in estimates of non-linear exposure-response relationships in two-stage analyses. The methods are based on multivariate meta-analytical techniques applied to estimates of multi-parameter associations from first-stage models, and implemented in the R package mvmeta (Gasparrini et al. 2011, 2012a) – see figure.
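The second-stage pooling step can be illustrated in its simplest form: a univariate, fixed-effect, inverse-variance meta-analysis of one coefficient per city. This is a simplification chosen for illustration and not the multivariate method implemented in mvmeta; the function name and the city-specific estimates are hypothetical.

```python
def pool_fixed_effect(estimates, variances):
    """Inverse-variance weighted average and its variance."""
    if len(estimates) != len(variances):
        raise ValueError("estimates and variances must have the same length")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1.0 / total


# Hypothetical first-stage results: one log relative risk per city,
# with its estimated variance.
city_estimates = [0.02, 0.04, 0.03]
city_variances = [0.0001, 0.0004, 0.0002]

pooled, pooled_var = pool_fixed_effect(city_estimates, city_variances)
```

More precise city-specific estimates receive more weight. The multivariate extension replaces each scalar estimate with a vector of coefficients (for example spline coefficients) and each variance with a covariance matrix, and usually adds a random-effects component for between-city heterogeneity.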

Modifiers of exposure-response associations. The modification of associations by area characteristics has been investigated through meta-regression using the two-stage approach discussed above (Tobías et al. 2014). Modification by individual characteristics (age, SES) varying within areas has been explored using interaction terms in simple models (Hajat et al. 2007). We have also proposed a version of the “case-only” approach, designed originally for studying gene-environment interactions, for use in a time series context to study how effects of time-varying risk factors (e.g. weather) might be modified by time-fixed factors, such as age or socio-economic status (Armstrong 2003).

Heat waves. Two other papers have explored models to allow estimation of the extent to which the excess deaths associated with heat waves can be explained by a continuous association between temperature and mortality, or whether an additional “wave effect” due to sustained heat is necessary (Gasparrini and Armstrong 2011; Hajat et al. 2006). Other work has developed ways to compare how the performance of heat-health warning systems depends on heat wave definitions (Gasparrini et al. 2010).

Mortality displacement (harvesting). Approaches have also been developed and applied to identify the extent of short-term “harvesting” (see above) (Hajat et al. 2005; Rehill et al. 2015).

Interrupted time series (ITS). A tutorial paper introduces the ARIMA and segmented regression approaches to ITS (Lagarde 2011). Other applied research has introduced methodological innovations. One paper evaluated the influence of alternative modelling assumptions on the estimate of the association between the introduction of state-wide smoking bans and the incidence of acute myocardial infarction (Gasparrini et al. 2009). Other papers have pioneered the use of conditional Poisson models for multiple ITS designs (Grundy et al. 2009) and methods for controlled ITS (Milojevic et al. 2012).

Other recent and ongoing methodological work. A recent paper introduced the use of conditional Poisson models for the case-crossover and related formulations of time series (Armstrong et al. 2014). Methodological work continues, focused in particular on extending ways of characterising variation in distributed lag non-linear models across cities or sub-populations, and on adapting time series regression methods for infectious diseases (Imai et al. 2014).

4.2 Applied research:

The substantive research using time series regression methods carried out at the LSHTM, or in which LSHTM researchers have collaborated, has mainly concerned the associations between daily occurrences of health outcomes (such as deaths) and time-varying environmental factors. The earliest examples (Gouveia and Fletcher 2000) concerned associations of daily air pollution with mortality, and this interest continues (Milojevic et al. 2014; Pattenden et al. 2010). However, most focus has been on associations of weather and season with health, of particular interest in the context of impending global warming. The most common health outcome has been mortality (Armstrong et al. 2010; Gasparrini et al. 2012b; Hajat et al. 2002; McMichael et al. 2008), but others include hospital admissions (Pudpong and Hajat 2011), GP visits (Hajat and Haines 2002), viral disease (Lopman et al. 2009), food-borne disease (Kovats et al. 2004; Tam et al. 2006), diarrhoea (Hashizume et al. 2007, 2008a, 2010), pregnancy outcome (Lee et al. 2008; Wolf and Armstrong 2012), myocardial infarctions (Bhaskaran et al. 2010, 2011, 2012) and defibrillator activation (McGuinn et al. 2012).

Several studies have focused in particular on which groups are vulnerable to the acute effects identified in time series regression, in particular those of weather (Hajat et al. 2007; Hajat and Kosatky 2010; Hashizume et al. 2008b; Wilkinson et al. 2004), but also those of limited daylight on injuries (Steinbach et al. 2014). Others have predicted, from time series regressions, the impact of climate change on deaths due to acute effects of heat and cold (Hajat et al. 2014; Vardoulakis et al. 2014).

Time series regression methods have also been used to study association of circulating RSV and influenza with hospital admission (Mangtani et al. 2006) and how much vaccination reduces that association with mortality (Armstrong et al. 2004).

Studies applying interrupted time series methods include those exploring the association of the introduction of state-wide smoking bans with cardiovascular morbidity (Barone-Adesi et al. 2011), the financial crisis with suicide rates in Spain (Lopez Bernal et al. 2013), 20 mph speed limits with road injuries (Grundy et al. 2009), and floods with mortality (Milojevic et al. 2011; Milojevic et al. 2012).

For other and in particular more recent relevant papers check out the personal web pages of the staff members, accessible from the list below.

5. LSHTM researchers involved in developing or using time series regression methodology

Antonio Gasparrini, Ben Armstrong, Clarence Tam, Jamie Lopez Bernal, Katherine Arbuthnott, Krishnan Bhaskaran, Mike Kenward, Mylene Lagarde, Paul Wilkinson, Punam Mangtani, Rebecca Steinbach, Sam Pattenden, Sari Kovats, Shakoor Hajat, Zaid Chalabi

6. Publications by LSHTM researchers

Armstrong B. 2006. Models for the relationship between ambient temperature and daily mortality. Epidemiology 17:624-631.

Armstrong B, Chalabi Z, Fenn B, Hajat S, Kovats RS, Milojevic A, et al. 2010. The association of mortality with high temperatures in a temperate climate: England and Wales. J Epidemiol Community Health [Epub ahead of print].

Armstrong BG. 2003. Fixed factors that modify the effects of time-varying factors: Applying the case-only approach. Epidemiology 14:467-472.

Armstrong BG, Mangtani P, Fletcher A, Kovats S, McMichael A, Pattenden S, et al. 2004. Effect of influenza vaccination on excess deaths occurring during periods of high circulation of influenza: Cohort study in elderly people. BMJ 329:660.

Armstrong BG, Gasparrini A, Tobias A. 2014. Conditional Poisson models: A flexible alternative to conditional logistic case cross-over analysis. BMC Medical Research Methodology 14:122.

Barone-Adesi F, Gasparrini A, Vizzini L, Merletti F, Richiardi L. 2011. Effects of Italian smoking regulation on rates of hospital admission for acute coronary events: A country-wide study. PLoS One 6:e17419.

*Bhaskaran K, Hajat S, Haines A, Herrett E, Wilkinson P, Smeeth L. 2010. Short term effects of temperature on risk of myocardial infarction in England and Wales: Time series regression analysis of the Myocardial Ischaemia National Audit Project (MINAP) registry. British Medical Journal 341:c3823.

*Bhaskaran K, Hajat S, Armstrong B, Haines A, Herrett E, Wilkinson P, et al. 2011. The effects of hourly differences in air pollution on the risk of myocardial infarction: Case crossover analysis of the MINAP database. BMJ 343:d5531.

*Bhaskaran K, Armstrong B, Hajat S, Haines A, Wilkinson P, Smeeth L. 2012. Heat and risk of myocardial infarction: Hourly level case-crossover analysis of MINAP database. BMJ 345:e8050.

*Bhaskaran K, Gasparrini A, Hajat S, Smeeth L, Armstrong B. 2013. Time series regression studies in environmental epidemiology. International Journal of Epidemiology 42:1187-1195.

Gasparrini A, Gorini G, Barchielli A. 2009. On the relationship between smoking bans and incidence of acute myocardial infarction. European Journal of Epidemiology 24:597-602.

*Gasparrini A, Armstrong B. 2010. Time series analysis on the health effects of temperature: Advancements and limitations. Environmental Research.

*Gasparrini A, Armstrong B, Kenward M. 2010. Distributed lag non-linear models. Statistics in Medicine 29(21): 2224-34.

*Gasparrini A. 2011. Distributed lag linear and non-linear models in R: The package dlnm. Journal of Statistical Software 43:1-20.

*Gasparrini A, Armstrong B. 2011. The impact of heat waves on mortality. Epidemiology 22:68.

*Gasparrini A, Armstrong B, Kenward MG. 2011. Multivariate meta-analysis: A method to summarize non-linear associations. Statistics in Medicine 30:2504–2506.

*Gasparrini A, Armstrong B, Kenward MG. 2012a. Multivariate meta-analysis for non-linear and other multi-parameter associations. Statistics in Medicine 31:3821-3839.

*Gasparrini A, Armstrong B, Kovats S, Wilkinson P. 2012b. The effect of high temperatures on cause-specific mortality in England and Wales. Occupational and Environmental Medicine 69:56-61.

Gasparrini A, Leone M. 2014. Attributable risk from distributed lag models. BMC Medical Research Methodology 14:55.

Gasparrini A, Guo Y, Hashizume M, Lavigne E, Zanobetti A, Schwartz J, et al. 2015. Mortality risk attributable to high and low ambient temperature: A multicountry observational study. The Lancet. In Press

*Gouveia N, Fletcher T. 2000. Time series analysis of air pollution and mortality: Effects by cause, age and socioeconomic status. Journal of Epidemiology and Community Health 54:750.

Grundy C, Steinbach R, Edwards P, Green J, Armstrong B, Wilkinson P. 2009. Effect of 20 mph traffic speed zones on road injuries in London, 1986-2006: Controlled interrupted time series analysis. BMJ 339:b4469.

Hajat S, Haines A. 2002. Associations of cold temperatures with GP consultations for respiratory and cardiovascular disease amongst the elderly in London. Int J Epidemiol 31:825-830.

Hajat S, Kovats RS, Atkinson RW, Haines A. 2002. Impact of hot temperatures on death in London: A time series approach. J Epidemiol Community Health 56:367-372.

Hajat S, Armstrong BG, Gouveia N, Wilkinson P. 2005. Mortality displacement of heat-related deaths: A comparison of Delhi, São Paulo, and London. Epidemiology 16:613-620.

Hajat S, Armstrong B, Baccini M, Biggeri A, Bisanti L, Russo A, et al. 2006. Impact of high temperatures on mortality: Is there an added heat wave effect? Epidemiology 17:632-638.

Hajat S, Kovats RS, Lachowycz K. 2007. Heat-related and cold-related deaths in England and Wales: Who is at risk? Occup Environ Med 64:93-100.

Hajat S, Kosatky T. 2010. Heat-related mortality: A review and exploration of heterogeneity. Journal of Epidemiology and Community Health 64:753-760.

Hajat S, Vardoulakis S, Heaviside C, Eggen B. 2014. Climate change effects on human health: Projections of temperature-related mortality for the UK during the 2020s, 2050s and 2080s. Journal of Epidemiology and Community Health 68:641-648.

*Hashizume M, Armstrong B, Hajat S, Wagatsuma Y, Faruque AS, Hayashi T, et al. 2007. Association between climate variability and hospital visits for non-cholera diarrhoea in Bangladesh: Effects and vulnerable groups. Int J Epidemiol 36:1030-1037.

*Hashizume M, Armstrong B, Hajat S, Wagatsuma Y, Faruque AS, Hayashi T, et al. 2008a. The effect of rainfall on the incidence of cholera in Bangladesh. Epidemiology 19:103-110.

*Hashizume M, Wagatsuma Y, Faruque AS, Hayashi T, Hunter PR, Armstrong B, et al. 2008b. Factors determining vulnerability to diarrhoea during and after severe floods in Bangladesh. J Water Health 6:323-332.

*Hashizume M, Faruque ASG, Wagatsuma Y, Hayashi T, Armstrong B. 2010. Cholera in Bangladesh: Climatic components of seasonal variation. Epidemiology 21:706-710.

Imai C, Armstrong B, Chalabi Z, Hashizume M, Mangtani P. 2014. Application of traditional time-series regression models for study of environmental determinants of infectious diseases. In: ISEE.

Kovats RS, Edwards SJ, Hajat S, Armstrong BG, Ebi KL, Menne B. 2004. The effect of temperature on food poisoning: A time-series analysis of salmonellosis in ten European countries. Epidemiol Infect 132:443-453.

Lagarde M. 2011. How to do (or not to do)… assessing the impact of a policy change with routine longitudinal data. Health policy and planning:czr004.

*Lee SJ, Hajat S, Steer PJ, Filippi V. 2008. A time-series analysis of any short-term effects of meteorological and air pollution factors on preterm births in London, UK. Environ Res 106:185-194.

*Lopez Bernal JA, Gasparrini A, Artundo CM, McKee M. 2013. The effect of the late 2000s financial crisis on suicides in Spain: An interrupted time-series analysis. European Journal of Public Health 23:732-736.

Lopman B, Armstrong B, Atchison C, Gray JJ. 2009. Host, weather and virological factors drive norovirus epidemiology: Time-series analysis of laboratory surveillance data in England and Wales. PLoS One 4:e6671.

Mangtani P, Hajat S, Kovats S, Wilkinson P, Armstrong B. 2006. The association of respiratory syncytial virus infection and influenza with emergency admissions for respiratory disease in London: An analysis of routine surveillance data. Clin Infect Dis 42:640-646.

*McGuinn L, Hajat S, Wilkinson P, Armstrong B, Anderson HR, Monk V, et al. 2012. Ambient temperature and activation of implantable cardioverter defibrillators. Int J Biometeorol.

McMichael AJ, Wilkinson P, Kovats RS, Pattenden S, Hajat S, Armstrong B, et al. 2008. International study of temperature, heat and urban mortality: The ‘ISOTHURM’ project. Int J Epidemiol.

Milojevic A, Armstrong B, Kovats S, Butler B, Hayes E, Leonardi G, et al. 2011. Long-term effects of flooding on mortality in England and Wales, 1994-2005: Controlled interrupted time-series analysis. Environ Health 10:11.

Milojevic A, Armstrong B, Hashizume M, McAllister K, Faruque A, Yunus M, et al. 2012. Health effects of flooding in rural Bangladesh. Epidemiology 23:107-115.

Milojevic A, Wilkinson P, Armstrong B, Bhaskaran K, Smeeth L, Hajat S. 2014. Short-term effects of air pollution on a range of cardiovascular events in England and Wales: Case-crossover analysis of the MINAP database, hospital admissions and mortality. Heart:heartjnl-2013-304963.

Pattenden S, Armstrong B, Milojevic A, Barratt B, Chalabi Z, Doherty R, et al. 2010. Ozone, heat and mortality in fifteen British conurbations. Occup Environ Med.

*Pudpong N, Hajat S. 2011. High temperature effects on out-patient visits and hospital admissions in Chiang Mai, Thailand. Science of the Total Environment 409:5260-5267.

*Rehill N, Armstrong B, Wilkinson P. 2015. Clarifying life lost due to cold and heat: A new approach. BMJ Open In Press.

*Steinbach R, Edwards P, Green J, Armstrong B. 2014. The contribution of light levels to ethnic differences in child pedestrian injury risk: A case-only analysis. Journal of Transport & Health 1:33-39.

*Tam CC, Rodrigues LC, O’Brien SJ, Hajat S. 2006. Temperature dependence of reported Campylobacter infection in England, 1989-1999. Epidemiol Infect 134:119-125.

Tobías A, Armstrong B, Gasparrini A, Diaz J. 2014. Effects of high summer temperatures on mortality in 50 Spanish cities. Environmental Health 13:48.

Vardoulakis S, Dear K, Hajat S, Heaviside C, Eggen B. 2014. Comparative assessment of the effects of climate change on heat- and cold-related mortality in the United Kingdom and Australia. Environmental Health Perspectives 122:1285–1292.

Wilkinson P, Pattenden S, Armstrong B, Fletcher A, Kovats RS, Mangtani P, et al. 2004. Vulnerability to winter mortality in elderly people in Britain: Population based study. BMJ 329:647.

*Wolf J, Armstrong B. 2012. The association of season and temperature with adverse pregnancy outcome in two German states, a time-series analysis. PLoS One 7:e40228.

* Research undertaken while the first author was a student at the LSHTM.

Last updated 1 April 2015. For more up-to-date publications, refer to researchers’ personal web pages.

7. Key references on methods


Bhaskaran K, Gasparrini A, Hajat S, Smeeth L, Armstrong B. Time series regression studies in environmental epidemiology. International Journal of Epidemiology. 2013;42(4):1187-1195.

Peng, R. D. and F. Dominici (2008). Statistical Methods for Environmental Epidemiology with R – A Case Study in Air Pollution and Health. New York, Springer.

Zeger, S. L., R. Irizarry and R. D. Peng (2006). On time series analysis of public health and biomedical data. Annual Review of Public Health 27: 57-79.

Armstrong, B. (2006). Models for the relationship between ambient temperature and daily mortality. Epidemiology 17(6): 624-31.

Dominici, F. (2004). Time-series analysis of air pollution and mortality: a statistical review. Research report – Health Effects Institute 123: 3-27; discussion 9-33.

Dominici, F., A. McDermott and T. J. Hastie (2004). Improved semiparametric time series models of air pollution and mortality. Journal of the American Statistical Association 99(468): 938-49.

Touloumi, G., R. Atkinson, A. Le Tertre, et al. (2004). Analysis of health outcome time series data in epidemiological studies. Environmetrics 15(2): 101-17.

On model selection

Dominici, F., C. Wang, C. Crainiceanu, et al. (2008). Model selection and health effect estimation in environmental epidemiology. Epidemiology 19(4): 558-60.

Crainiceanu, C. M., F. Dominici and G. Parmigiani (2008). Adjustment uncertainty in effect estimation. Biometrika 95(3): 635.

Baccini, M., A. Biggeri, C. Lagazio, et al. (2007). Parametric and semi-parametric approaches in the analysis of short-term effects of air pollution on health. Computational Statistics and Data Analysis 51(9): 4324-36.

He, S., S. Mazumdar and V. C. Arena (2006). A comparative study of the use of GAM and GLM in air pollution research. Environmetrics 17(1): 81-93.

Peng, R. D., F. Dominici and T. A. Louis (2006). Model choice in time series studies of air pollution and mortality. Journal of the Royal Statistical Society: Series A 169(2): 179-203.

On smoothing methods

Marra, G. and R. Radice (2010). Penalised regression splines: theory and application to medical research. Statistical Methods in Medical Research 19(2): 107-25.

Schimek, M. G. (2009). Semiparametric penalized generalized additive models for environmental research and epidemiology. Environmetrics 20(6): 699-717.

Wood, S. N. (2006). Generalized Additive Models: an Introduction with R, Chapman & Hall/CRC.

Dominici, F., M. J. Daniels, S. L. Zeger, et al. (2002a). Air pollution and mortality: estimating regional and national dose-response relationships. Journal of the American Statistical Association 97: 100-11.

Dominici, F., A. McDermott, S. L. Zeger, et al. (2002b). On the use of generalized additive models in time-series studies of air pollution and health. American Journal of Epidemiology 156(3): 193-203.

On harvesting effect

Rabl, A. (2005). Air pollution mortality: harvesting and loss of life expectancy. Journal of Toxicology and Environmental Health: Part A 68(13-14): 1175-80.

Schwartz, J. (2001). Is there harvesting in the association of airborne particles with daily deaths and hospital admissions? Epidemiology 12(1): 55-61.

Schwartz, J. (2000b). Harvesting and long term exposure effects in the relation between air pollution and mortality. American Journal of Epidemiology 151(5): 440-8.

On distributed lag (non-linear) models

Gasparrini A. Modeling exposure-lag-response associations with distributed lag non-linear models. Statistics in Medicine. 2014;33(5):881-899.

Gasparrini, A. (2011). Distributed Lag Linear and Non-Linear Models in R: The Package dlnm. J Stat Softw 43(8): 1-20.

Gasparrini, A., B. Armstrong and M. G. Kenward (2010). Distributed lag non-linear models. Statistics in Medicine 29(21): 2224-34.

Muggeo, V. M. (2008). Modeling temperature effects on mortality: multiple segmented relationships with common break points. Biostatistics 9(4): 613-20.

Schwartz, J. (2000a). The distributed lag between air pollution and daily deaths. Epidemiology 11(3): 320-6.

On meta-analytic techniques

Gasparrini, A., B. Armstrong, et al. (2012). Multivariate meta-analysis for non-linear and other multi-parameter associations. Statistics in Medicine 31: 3821-3839.

Dominici, F., J. M. Samet and S. L. Zeger (2000). Combining evidence on air pollution and daily mortality from the 20 largest US cities: a hierarchical modelling strategy. Journal of the Royal Statistical Society: Series A 163(3): 263-302.

Schwartz, J. and A. Zanobetti (2000). Using meta-smoothing to estimate dose-response trends across multiple studies, with application to air pollution and daily death. Epidemiology 11(6): 666-72.

On interrupted time series

Wagner, A. K., S. B. Soumerai, F. Zhang, et al. (2002). Segmented regression analysis of interrupted time series studies in medication use research. Journal of Clinical Pharmacy and Therapeutics 27(4): 299-309.

Shadish WR, Cook TD, et al. (2002). Experimental and quasi-experimental designs for generalized causal inference.