 Analysis of Clinical Trials

Theme Coordinators: Neal Alexander, Baptiste Leurent, Clemence Leyrat, Amy Mulick, Stephen Nash, Linda Sharples
Please see here for slides and audio recordings of previous seminars relating to this theme.
Randomised controlled trials (RCTs) are one of the most important tools to estimate effects of medical interventions. However, there are a number of issues relating to the statistical analysis of RCTs over which there is much debate. We aim to consider and raise discussion of a number of these issues, and suggest possible statistical approaches for dealing with these in the analyses of RCTs. We briefly introduce here some of these issues.
Topics:
 Cluster randomised trials
 Covariate adjustment
 Subgroup analysis
 Missing data
 Noncompliance
 Sequential trials
 Good practice in trials
Cluster randomised trials
In cluster randomised trials (CRTs), groups of participants, rather than the participants themselves, are randomised to intervention groups^{1}. This design is increasingly used to assess complex interventions, in particular community-level interventions. Many CRTs have been conducted at LSHTM to evaluate the effectiveness of healthcare interventions, motivating a wide range of methodological research on their design^{2,3}, analysis^{4,5} and reporting^{6} to take into account the specificities of such trials. One challenge in CRTs arises because the outcome measures of participants in the same cluster are not independent, and thus clustering must be accounted for in the analysis. For this reason, CRTs are also a main research area within the Design and Analysis for Dependent Data CSM theme.
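One common way to quantify why clustering matters is the design effect, 1 + (m - 1)ρ, which approximates how much the variance of an estimate is inflated when clusters have equal size m and intracluster correlation coefficient ρ. The sketch below is illustrative only: the function names and the numbers (50 participants per cluster, ρ = 0.02, 2000 participants) are assumptions chosen for the example, not values from any trial mentioned here.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * rho."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_participants: int, cluster_size: float, icc: float) -> float:
    """Number of independent participants carrying the same information."""
    return n_participants / design_effect(cluster_size, icc)


# Hypothetical trial: 40 clusters of 50 participants, ICC of 0.02.
deff = design_effect(cluster_size=50, icc=0.02)
ess = effective_sample_size(n_participants=2000, cluster_size=50, icc=0.02)
print(f"design effect = {deff:.2f}, effective sample size = {ess:.0f}")
```

Even a small intracluster correlation roughly halves the effective sample size in this example, which is why an analysis that ignores clustering tends to overstate precision.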
Other issues arising from CRTs have been recently studied at the CSM, such as the risk of systematic baseline imbalance^{7}, analysis in the presence of missing data^{8}, and analysis of alternative cluster designs such as cluster crossover trials^{9}. Ongoing projects include research into spatial analysis of CRTs, the analysis strategy when only a small number of clusters are randomised, and the estimate of causal effects in CRTs with noncompliance.
 1. Hayes RJ, Moulton LH. Cluster Randomised Trials. Taylor & Francis; 2009. 338 p.
 2. Hayes RJ, Alexander ND, Bennett S, Cousens SN. Design and analysis issues in cluster-randomized trials of interventions against infectious diseases. Stat Methods Med Res. 2000 Apr;9(2):95–116.
 3. Thomson A, Hayes R, Cousens S. Measures of between-cluster variability in cluster randomized trials with binary outcomes. Stat Med. 2009 May 30;28(12):1739–51.
 4. Gomes M, Díaz-Ordaz K, Grieve R, Kenward MG. Multiple imputation methods for handling missing data in cost-effectiveness analyses that use data from hierarchical studies: an application to cluster randomized trials. Med Decis Mak Int J Soc Med Decis Mak. 2013 Nov;33(8):1051–63.
 5. Alexander N, Emerson P. Analysis of incidence rates in cluster-randomized trials of interventions against recurrent infections, with an application to trachoma. Stat Med. 2005 Sep 15;24(17):2637–47.
 6. Campbell MK, Piaggio G, Elbourne DR, Altman DG, for the CONSORT Group. Consort 2010 statement: extension to cluster randomised trials. BMJ. 2012 Sep 4;345:e5661.
 7. Leyrat C, Caille A, Foucher Y, Giraudeau B. Propensity score to detect baseline imbalance in cluster randomized trials: the role of the c-statistic. BMC Med Res Methodol. 2015;
 8. Diaz-Ordaz K, Kenward MG, Gomes M, Grieve R. Multiple imputation methods for bivariate outcomes in cluster randomised trials. Stat Med. 2016 Sep 10;35(20):3482–96.
 9. Morgan KE, Forbes AB, Keogh RH, Jairath V, Kahan BC. Choosing appropriate analysis methods for cluster randomised crossover trials with a binary outcome. Stat Med. 2016 Sep 28;
 Bayesian Statistics

 Big Data and Machine Learning

Theme Coordinators: Elizabeth Williamson, Nuno Sepulveda, Jan van der Meulen, Luigi Palla
Please see here for slides and audio recordings of previous seminars relating to this theme.
Contents:
 Big data – a quick overview
 Some key methodological issues
 Some areas of application
 Events
In recent years, big data has become the new hype in biomedical research due to great achievements in technological development. Benchmark examples of big data are (i) online tracking of flu epidemics, (ii) the genetic and genomic analyses of many human diseases, and (iii) the analysis of millions of health and hospital records. However, the explosion of big data applications has brought with it interesting methodological questions, such as how to best store, manage, analyse and integrate such ever-increasing data.
This theme aims to provide a sharing space for methodological development on big data problems and its dissemination across the LSHTM research community.
Some of the key methodological issues that members of our theme are working on are:
 Methods for assessing and improving data quality
 Missing and poorly measured data
 Data linkage
 Data mining, multivariate statistics
 Causal inference for big data
 Phenotyping
 Stochastic models for high throughput technologies
 Geomapping
 Prediction
 Machine learning
Some areas of application using big data within the school are:
 Environmental epidemiology
 Health service evaluation
 Health economics studies
 Pharmacoepidemiology
 Nutritional epidemiology
 Genomic epidemiology
 ‘Omics integration and systems biology
 Seroepidemiology of infectious disease
 Analysis of microbiome
Statistical/methodological issues
Methods for assessing and improving data quality
The 3 V’s for big data – volume, variety, and velocity – were quickly succeeded by the 4 V’s, adding in “veracity”. It was quickly recognised that the most sophisticated big data analytics could not overcome limitations of poorly captured data. Increasing electronic capture and storage of information does not, unfortunately, guarantee good data quality.
There is a relative paucity of methodological work to assess and improve the quality of data in big data settings. However, better detection of errors, leading to enhanced chances of correcting erroneous data, is essential for the validity of subsequent analysis.
Various approaches for detecting likely errors in data have been proposed. In the context of longitudinal data within routinely collected primary care data, one promising method developed in collaboration with members of our theme uses an iterative approach of fitting mixed models, identifying likely outliers, and refitting the model after removal of outliers (Welch et al, 2012).
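The fit, flag, refit cycle described above can be sketched in a few lines. This is not the method of Welch et al: as a simplifying assumption it replaces the mixed model with a person-specific intercept and a common slope fitted by least squares, and the function name `flag_outliers`, the 4-standard-deviation threshold and the simulated data are all hypothetical.

```python
import numpy as np


def flag_outliers(person, time, y, n_sd=4.0, max_iter=20):
    """Iteratively fit a person-intercept, common-slope model and drop large residuals.

    Returns a boolean array that is True for records flagged as likely outliers.
    """
    keep = np.ones(y.shape[0], dtype=bool)
    n_people = int(person.max()) + 1
    for _ in range(max_iter):
        # Person-level means use only the records that are still retained.
        counts = np.maximum(np.bincount(person[keep], minlength=n_people), 1)
        y_mean = np.bincount(person[keep], weights=y[keep], minlength=n_people) / counts
        t_mean = np.bincount(person[keep], weights=time[keep], minlength=n_people) / counts
        y_c = y - y_mean[person]
        t_c = time - t_mean[person]
        # Common slope from the within-person (centred) data.
        slope = np.sum(t_c[keep] * y_c[keep]) / np.sum(t_c[keep] ** 2)
        resid = y_c - slope * t_c
        new_keep = keep & (np.abs(resid) <= n_sd * resid[keep].std())
        if new_keep.sum() == keep.sum():
            break  # nothing new was flagged, so the fit has stabilised
        keep = new_keep
    return ~keep


# Simulated repeated measurements: 50 people, 10 visits each.
rng = np.random.default_rng(2012)
n_people, n_visits = 50, 10
person = np.repeat(np.arange(n_people), n_visits)
time = np.tile(np.arange(n_visits, dtype=float), n_people)
y = 70 + rng.normal(0, 5, n_people)[person] + 0.5 * time + rng.normal(0, 1, person.size)

# Inject five gross recording errors, each in a different person.
injected = np.array([7, 113, 226, 341, 458])
y[injected] += 50.0

flagged = np.flatnonzero(flag_outliers(person, time, y))
print("flagged record indices:", flagged)
```

Refitting after each removal matters because gross errors inflate the residual spread and distort the person-level means, which can mask smaller errors in the first pass.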
Some relevant references:
Welch C, Petersen I, Walters K, Morris RW, Nazareth I, Kalaitzaki E, White IR, Marston L, Carpenter J. Two-stage method to remove population- and individual-level outliers from longitudinal data in a primary care database. Pharmacoepidemiology and Drug Safety, 2012; 21: 725-732.
Missing data
The challenge of missing data is not restricted to the context of large datasets of routinely or semi-automatically collected data. However, missing data in such settings raises complex and often novel challenges; we highlight two below.
The first is referred to as ‘data-dependent sampling’: the process on which you are trying to collect data controls, to some extent, the data you are able to collect. To give two examples:
 using wearable devices to measure activity can overestimate usual activity due to participants choosing to leave their device at home on low activity days; this is a form of measurement error
 in routinely collected primary care data, clinical and therapeutic information is collected only when the patient chooses to visit their general practitioner – and then only for reasons specifically relevant to the consultation.
The second challenge arises because of the sheer volume of the data. While imputation and related approaches are a flexible and powerful approach and offer much potential, they must be adapted to meet these challenges, which can violate their underpinning assumptions.
Some recent work within the group that has addressed some of these challenges:
 two-fold imputation, an adaptation of multiple imputation which attempts to simplify the problem by conditioning only on measurements which are local in time (Welch et al, 2014); and
 a paper correcting misconceptions on the use of multiple imputation to handle missing data in propensity score analyses (Leyrat et al, 2017)
Some relevant references:
Welch C, Petersen I, Bartlett J, White IR, Marston L, Morris RW, Nazareth I, Walters K, Carpenter J. Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data. Statist. Med. 2014, 33:3725–3737.
Leyrat C, Seaman SR, White IR, Douglas I, Smeeth L, Kim J, Resche-Rigon M, Carpenter JR, Williamson EK. Propensity score analysis with partially observed covariates: How should multiple imputation be used? Stat Methods Med Research, 2017, doi: 10.1177/0962280217713032. [Epub ahead of print]
Causal inference
Assessing causal relationships from non-randomised data poses many methodological challenges, particularly related to confounding and selection bias. These are exacerbated in studies conducted using routinely collected data: data not collected for the primary purpose of research tend to be less regular and less complete than traditional data sources used to address such questions.
In comparative effectiveness studies of medications, there is often a wealth of information available regarding previous diagnoses, medications, referrals and therapies. However, how best to incorporate this information into analyses remains unclear. The high-dimensional propensity score (Schneeweiss et al, 2009) is an empirical algorithm to select potential confounders, prioritise candidates, and incorporate selected variables into a propensity-score based statistical model. This algorithm was developed in the context of US claims data; the validity of its application to different settings, such as routinely collected primary care data in the UK, remains unclear.
An alternative approach to the incorporation of a large number of potential confounders into a causal model is offered by Targeted Maximum Likelihood Estimation (TMLE). This approach has been applied to UK primary care data to investigate the association between statins and all-cause mortality (Pang et al, 2016), with the authors concluding that a deeper understanding of the comparative advantages and disadvantages of this approach was needed within this big-data setting.
To begin to address this knowledge gap, members of our theme have developed a free open source online tutorial to introduce TMLE for Causal Inference, which can be found here.
They have also created and made available a free open source Stata program to implement double-robust methods for causal inference, including Machine Learning algorithms for prediction (see links below).
A promising approach to poorly measured, or unmeasured, confounding is offered by self-controlled designs, such as the self-controlled risk interval, case-crossover and self-controlled case series designs. The self-controlled case series, for example, uses individuals as their own controls, thus removing time-invariant confounders.
Some relevant references:
Franklin JM, Schneeweiss S, Solomon DH. Assessment of Confounders in Comparative Effectiveness Studies From Secondary Databases. Am J Epidemiol. 2017; 185(6): 474-478. doi: 10.1093/aje/kww136.
Franklin JM, Eddings W, Austin PC, Stuart EA, Schneeweiss S. Comparing the performance of propensity score methods in healthcare database studies with rare outcomes. Stat Med. 2017; 36(12): 1946-1963. doi: 10.1002/sim.7250.
Schneeweiss S, Rassen JA, Glynn RJ, Avorn J, Mogun H, Brookhart MA. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology. 2009; 20(4): 512-522.
Pang M, Schuster T, Filion KB, Eberg M, Platt RW. Targeted maximum likelihood estimation for pharmacoepidemiologic research. Epidemiology. 2016; 27(4): 570-577.
Kang JD, Schafer JL. Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science. 2007: 523-39.
Schuler MS, Rose S. Targeted Maximum Likelihood Estimation for Causal Inference in Observational Studies. American Journal of Epidemiology. 2016. doi: 10.1093/aje/kww165.
Gruber S, van der Laan MJ. tmle: An R Package for Targeted Maximum Likelihood Estimation. Journal of Statistical Software. 2012; 51(13).
Gruber. Targeted Learning in Healthcare Research. Big Data. 2016; 3(4), 211-218. DOI:10.1089/big.2015.0025.
Software (open source):
Author: Dr. Miguel Angel Luque-Fernandez, LSHTM.
https://github.com/migariane/meltmle
https://github.com/migariane/weltmle
Online tutorial:
Author: Dr. Miguel Angel Luque-Fernandez, LSHTM.
https://migariane.github.io/TMLE.nb.html
Linkage
Linkage has been described as “a merging that brings together information from two or more sources of data with the object of consolidating facts concerning an individual or an event that are not available in any separate record” (Organisation for Economic Co-operation and Development (OECD) Glossary of Statistical Terms).
Linking health related datasets offers the opportunity to improve data quality, by improving ascertainment of key risk-factors and outcomes, allowing inconsistencies to be identified and resolved. It is a cost-effective means of assembling a dataset, exploiting existing resources. However, challenges associated with data linkage include the lack of unique identifiers for linkage, leading to possible errors in the linkage, and data security considerations.
Small amounts of linkage error can result in substantially biased results. False matches introduce variability and weaken the association between variables, often resulting in bias to the null, and missed matches reduce the sample size and result in a loss of statistical power and potential selection bias. Evaluating the potential impact of linkage error on results is vital (Harron et al, 2014).
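The attenuation caused by false matches can be illustrated with a small simulation. Everything in the sketch below is an assumption made for illustration: the true slope of 0.5, the 20% false-match rate, and the simplification that a falsely matched record receives the outcome of a randomly chosen individual. Missed matches are not modelled.

```python
import numpy as np

rng = np.random.default_rng(2014)
n = 50_000
true_slope = 0.5          # assumed effect of exposure x on outcome y
false_match_rate = 0.2    # assumed proportion of records linked to the wrong person

# Exposure held in one dataset, outcome held in another.
x = rng.normal(size=n)
y = true_slope * x + rng.normal(size=n)

# A falsely matched record receives the outcome of a randomly chosen individual.
is_false_match = rng.random(n) < false_match_rate
y_linked = y.copy()
y_linked[is_false_match] = rng.choice(y, size=int(is_false_match.sum()))


def ols_slope(exposure, outcome):
    """Slope from a simple linear regression of outcome on exposure."""
    return np.cov(exposure, outcome)[0, 1] / np.var(exposure, ddof=1)


slope_perfect = ols_slope(x, y)
slope_linked = ols_slope(x, y_linked)
print(f"perfect linkage: {slope_perfect:.3f}, with false matches: {slope_linked:.3f}")
```

Under these assumptions the estimated slope shrinks by roughly the false-match rate, because the falsely matched outcomes are unrelated to the exposure. Real linkage error is rarely this simple, since it often depends on individual characteristics.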
Some relevant references:
Harron K, Wade A, Gilbert R, Muller-Pebody B, Goldstein H. Evaluating bias due to data linkage error in electronic healthcare records. BMC Medical Research Methodology, 2014, 14: 36. DOI: 10.1186/1471-2288-14-36.
Topics
Environmental epidemiology
In recent decades, the research community has made important steps forward in understanding the relationship between exposure to environmental factors and human health. Big data technologies offer the opportunity to extend this research further, for instance by making available high-resolution exposure data from remote sensing tools and real-time measurement from smartphone mobile applications, and by linking electronic health records including large collections of variables on health data and personal characteristics. However, this new setting requires the development of novel analytical methods for handling complex data structures and for modelling individual risk profiles with longitudinal measures on time-varying exposures, health outcomes and susceptibility factors. This new ‘big data’ framework can improve the analytical capability of environmental health studies and extend our knowledge on the complex pathways linking exposures to environmental stressors and human health.
Some relevant references:
Gasparrini A et al. Mortality risk attributable to high and low ambient temperature: a multi-country observational study. The Lancet, Volume 386, Issue 9991, 369-375.
Di Q, Wang Y et al. Air pollution and mortality in the Medicare population. New Eng J Med. 2017, Volume 376, No 26, 2513-22.
Pharmacoepidemiology
Large-scale routinely collected health data provide unprecedented potential for population-based health research. The advantages of using these data include the low cost and timeliness of the research, greatly increased population coverage, and increased statistical power. The US Food and Drug Administration and European Medicines Agency now mandate the use of “real world evidence” of medication effects in drug licensing; in practice such real world evidence often comes from studies incorporating routinely collected health data.
Use of routinely collected data to establish causal relationships, however, raises a number of challenges. Information bias, due to low quality or missing information, remains an issue despite financial incentive schemes aimed at improving data quality such as the Quality and Outcomes Framework in the UK. Linkage between data sources improves capture of key outcomes and exposures, but brings additional potential sources of bias. Confounding by poorly measured or unmeasured factors complicates the comparison between groups prescribed different medications. Related to this, the very reasons drugs are prescribed are often highly correlated with the outcomes we wish to study, and this “confounding by indication” remains a key challenge in pharmacoepidemiology.
Despite these challenges, there have been notable successes. For example, members of our theme used linked primary and secondary care data to replicate well known results from randomised trials regarding the effect of statins on vascular outcomes. The same study was used to demonstrate no association between statins and cancer; the absence of this link was later confirmed by randomised trials. Self-controlled designs have great potential to remove time-invariant confounding, and have been successfully implemented by our group to investigate a wide range of associations between vaccines/drugs and adverse outcomes.
Some relevant references:
Smeeth L, Douglas I, Hall AJ, Hubbard R, Evans S. Effect of statins on a wide range of health outcomes: a cohort study validated by comparison with randomised trials. Br J Clin Pharmacol. 2009; 67(1): 99-109.
Nutritional epidemiology
The effects of dietary intake (containing many different foods and nutrients) on health are complex. Understanding specific effects requires taking into account the interactions among dietary exposures. It is now recognised that these exposures should be analysed jointly to identify dietary patterns, which better summarise the effect of food intake on health. To this end, multivariate statistical methods like principal component, cluster and factor analysis are needed.
Because food intake can be difficult to measure as an epidemiological exposure, attempts are underway to measure dietary behaviour via (validated) metabolic biomarkers. This has linked the field of nutritional epidemiology to the ‘Omics field and to the various methodological issues characterising chemometric data obtained via Nuclear Magnetic Resonance (NMR) and high-throughput Mass Spectrometry (MS).
Additional Big Data complexities arise in nutritional surveys conducted through dietary diaries, as these record the eating occasions at different times for each individual involved. This results in a very large number of observations, which on the one hand can be used for data mining and hypothesis generation (e.g. in the context and timing of eating) through multivariate methods, and on the other call for methodological developments to accommodate their complex hierarchical structure.
Some relevant references:
Gleason PM, Boushey CJ, Harris JE et al. Publishing nutrition research: A review of Multivariate Techniques Part 3: Data Reduction Methods. J Acad Nutr Diet. 2015;115:1072-1082.
Assi N, Moskat A, Slimani N et al. A treelet transform analysis to relate nutrient patterns to the risk of hormonal receptor-defined breast cancer in the European Prospective Investigation into Cancer and Nutrition (EPIC). Public Health Nutr, 2016 Feb; 19(2): 242-54.
Chapman A, Beh E, Palla L. Application of Correspondence Analysis to Graphically Investigate Associations Between Foods and Eating Locations. Studies in Health Technology and Informatics, 2017; 235: 166-170.
Events
On 14 March 2017 we held an introductory workshop to discuss shared methodological interests within Big Data across the school. Please see here for slides and audio recordings.
On 7 July 2017 we held a half-day symposium on “Statistical Methods for Big Data”. Please see here for the full details and here for slides and audio recordings.
Through the year, we will organise a series of workshops and seminars aimed at bringing together researchers encountering methodological challenges in analysing big data, and methodologists with interests in relevant areas.
 Causal Inference

Theme Coordinators: Simon Cousens, Karla Diaz-Ordaz, Richard Silverwood, Ruth Keogh, Stijn Vansteelandt.
Please see here for slides and audio recordings of previous seminars relating to this theme.
This field was recently criticised in a paper by Vandenbroucke et al. Members of this theme have written a response to these criticisms, which is available here.
Seminars
As part of the CSM’s activities, seminars on causal inference are often organised. Past speakers include Philip Dawid, Vanessa Didelez, Richard Emsley, Miguel Hernán, Erica Moodie and Anders Skrondal.
Details of upcoming seminars can be found here.
UK Causal Inference Meeting 2016
We were delighted to host, together with the London School of Economics and Political Science, the 4^{th} annual UK Causal Inference Meeting, which took place at LSHTM from 13-15 April 2016. More details can be found here.
Other events
As well as regular research seminars, we occasionally organise one-off events on topics of particular interest. For example, we are currently planning a half-day meeting on recent controversies in propensity scores. Details of upcoming events can be found here.
A brief overview of a vast and rapidly expanding subject
Causal inference is a central aim of many empirical investigations, and arguably most studies in the fields of medicine, epidemiology and public health. We would like to know ‘does this treatment work?’, ‘how harmful is this exposure?’, or ‘what would be the impact of this policy change?’.
The gold standard approach to answering such questions is to conduct a controlled experiment in which treatments/exposures are allocated at random, all participants adhere perfectly to the treatment assigned, and all the relevant data are collected and measured without error. Provided that we can then discount ‘chance’ alone as an explanation, any observed differences between treatment groups can be given a causal interpretation (albeit in a population that may differ from the one in which we are interested).
In the real world, however, such experiments rarely attain this ideal status, and for many important questions, such an experiment would not even be ethically, practically, or economically feasible, and our investigations must be based instead on observational data. In reality, therefore, causal inference is a very ambitious goal. However, since it undeniably is the only useful goal in so many contexts, we must try our best. This involves carefully formulating the causal question to be tackled, explicitly stating the assumptions under which the answers may be trusted, often considering novel analysis methods that may require weaker assumptions than would be required by traditional approaches, and finally using sensitivity analyses to explore how robust our conclusions are to violations of the assumptions.
Historically, even when attempting causal inference, the role of statistics was seen to be to quantify the extent to which ‘chance’ could explain the results, with concerns over systematic biases due to the non-ideal nature of the data relegated to the qualitative discussion of the results. The field known as causal inference has changed this state of affairs, setting causal questions within a coherent framework which facilitates explicit statement of all the assumptions underlying the analysis and allows extensive exploration of potential biases. In the paragraphs that follow, we will attempt a brief overview.
A language for causal inference (potential outcomes and counterfactuals)
Over the last thirty years, a formal statistical language has been developed in which causal effects can be unambiguously defined, and the assumptions needed for their identification clearly stated. Although alternative frameworks have been suggested (see, for example, Dawid, 2000) and developed, the language which has gained most traction in the health sciences is that of potential outcomes, also called counterfactuals (Rubin, 1978).
Suppose X is a binary exposure, Y a binary outcome, and C a collection of potential confounders, measured before X. We write Y^{0} and Y^{1} for the two potential outcomes; the first is the outcome that would be seen if X were set (possibly counter to fact) to 0, and the second is what would be seen if X were set to 1. Causal effects can then be expressed as contrasts of aspects of the distribution of these potential outcomes. For example:
E(Y^{1}) - E(Y^{0})
E(Y^{1}|X=1) / E(Y^{0}|X=1)
log[E(Y^{1}|C) / {1 - E(Y^{1}|C)}] - log[E(Y^{0}|C) / {1 - E(Y^{0}|C)}]
Causal inference methods for economic evaluations of longitudinal interventions
Noemi Kreif (LSHTM)
Noemi holds a Medical Research Council Early Career Fellowship in the Economics of Health, on improving statistical methods to address confounding in the economic evaluation of health interventions, including a collaboration with Dr Maya Petersen at UC Berkeley Division of Biostatistics. She is investigating advanced causal inference methods for the setting of economic evaluations of longitudinal interventions. In particular, she is using targeted maximum likelihood estimation and machine learning to compare dynamic treatment regimes, using a non-randomised study on nutritional intake of critically ill children.
Conditional exchangeability formalises the notion of “no unmeasured confounders”.
The increased clarity afforded by this language has led to increased awareness of causal pitfalls (such as the ‘birthweight paradox’; see Hernández-Díaz et al, 2006) and the building of a new and extensive toolbox of statistical methods especially designed for making causal inferences from non-ideal data under transparent, less restrictive and more plausible assumptions than were hitherto required.
Of course this does not mean that all causal questions can be answered, but at least they can be formally formulated and the plausibility of the required assumptions assessed.
Considerations of causality are not new. Neyman used potential outcomes in his PhD thesis in the 1920s, and who could forget Bradford Hill’s much-cited guidelines published in 1965? The last few decades, however, have seen the focus move towards developing solutions, as well as acknowledging limitations.
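To make the potential-outcomes notation concrete, the simulation below generates both Y^{0} and Y^{1} for every individual, something never possible with real data, and compares the true average causal effect E(Y^{1}) - E(Y^{0}) with two estimates based only on the observed X, Y and C. All probabilities are arbitrary assumptions, and the data are constructed so that exposure depends on C alone, meaning there is no unmeasured confounding given C.

```python
import numpy as np

rng = np.random.default_rng(1978)
n = 200_000

# Binary confounder C, measured before exposure X.
c = rng.random(n) < 0.5
# Exposure depends on C only, so there is no unmeasured confounding given C.
x = rng.random(n) < np.where(c, 0.8, 0.2)

# Both potential outcomes for everyone (assumed risks; unobservable in practice).
u = rng.random(n)
y0 = u < np.where(c, 0.4, 0.1)   # outcome if X were set to 0
y1 = u < np.where(c, 0.5, 0.2)   # outcome if X were set to 1
# The observed outcome is the potential outcome under the exposure received.
y = np.where(x, y1, y0)

true_ace = y1.mean() - y0.mean()

# Crude contrast of exposed and unexposed, ignoring C.
naive = y[x].mean() - y[~x].mean()

# Standardisation: average the stratum-specific contrasts over the distribution of C.
standardised = 0.0
for level in (False, True):
    in_stratum = c == level
    diff = y[in_stratum & x].mean() - y[in_stratum & ~x].mean()
    standardised += diff * in_stratum.mean()

print(f"true: {true_ace:.3f}, naive: {naive:.3f}, standardised: {standardised:.3f}")
```

The crude contrast is confounded by C, whereas the standardised estimate recovers the true effect, but only because the simulation satisfies the assumption that C captures all confounding.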
Traditional methods
Not all reliable causal inference requires novel methodology. A carefully considered regression model, with an appropriate set of potential confounders (possibly identified using a causal diagram, see below) measured and appropriately included as covariates, is a reasonable approach in some settings.
Causal diagrams
A ubiquitous feature of methods for estimating causal effects from non-ideal data is the need for untestable assumptions regarding the causal structure of the variables being analysed (from which conditions such as conditional exchangeability can be deduced). Such assumptions are often represented in a causal diagram or graph, with variables identified by nodes and the relationships between them by edges. The simplest and most commonly used class of causal diagram is the (causal) directed acyclic graph (DAG), in which all edges are arrows, and there are no cycles, i.e. no variable explains itself (Greenland et al, 1999). These are used not only to represent assumptions but also to inform the choice of a causally interpretable analysis, specifically to help decide which variables should be included as confounders.
Fully parametric approaches to problems involving many variables
Another common feature of causal inference methods is that, as we move further from the ideal experimental setting, more aspects of the joint distribution of the variables must be modelled, which would have been ancillary had the data arisen from a perfect experiment. Structural equation modelling (SEM) (Kline, 2011) is a fully parametric approach, in which the relationship between each node in the graph and its parents is specified parametrically. This approach offers an elegant (full likelihood) treatment of ignorable missing data and measurement error, when this affects any variable for which validation or replication data are available.
Semiparametric approaches
Concerns over the potential impact of model misspecification in fully parametric approaches have led to the development of alternative semiparametric approaches to causal inference, in which the number of additional aspects to be modelled is reduced. These include methods based on the propensity score (Rosenbaum and Rubin, 1983), including inverse probability weighting, and g-estimation, and the so-called doubly-robust estimation proposed by Robins, Rotnitzky and others.
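As an illustration of inverse probability weighting, the sketch below estimates the propensity score P(X=1|C) and weights each individual by the inverse of the probability of the exposure they actually received. It makes several simplifying assumptions: a single binary confounder, a propensity score estimated by stratum proportions rather than a regression model, and simulated data with a known risk difference of 0.1.

```python
import numpy as np

rng = np.random.default_rng(1983)
n = 200_000

# Simulated observational data (all probabilities are assumptions).
c = (rng.random(n) < 0.5).astype(int)                        # binary confounder
x = (rng.random(n) < 0.2 + 0.6 * c).astype(int)              # exposure depends on C
y = (rng.random(n) < 0.1 + 0.1 * x + 0.3 * c).astype(int)    # true risk difference 0.1

# Propensity score P(X=1 | C), estimated by the exposed proportion in each stratum.
ps = np.where(c == 1, x[c == 1].mean(), x[c == 0].mean())

# Weight = 1 / probability of the exposure actually received.
weights = np.where(x == 1, 1.0 / ps, 1.0 / (1.0 - ps))


def weighted_mean(values, w):
    """Weighted mean with normalised weights."""
    return np.sum(values * w) / np.sum(w)


ipw = (weighted_mean(y[x == 1], weights[x == 1])
       - weighted_mean(y[x == 0], weights[x == 0]))
naive = y[x == 1].mean() - y[x == 0].mean()
print(f"naive: {naive:.3f}, IPW: {ipw:.3f}")
```

Weighting creates a pseudo-population in which C is balanced between the exposure groups, so only the exposure model needs to be specified, not the outcome model. With many or continuous confounders the propensity score would normally come from a fitted model, and its misspecification becomes the main concern.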
Inferring the effects of time-varying exposures
Novel causal inference methods are particularly relevant for studying the causal effect of a time-varying exposure on an outcome, because standard methods fail to give causally interpretable estimators when there exist time-varying confounders of the exposure and outcome that are themselves affected by previous levels of the exposure. Methods developed to deal with this problem include the fully parametric g-computation formula (Robins, 1986), and two semiparametric approaches: g-estimation of structural nested models (Robins et al, 1992), and inverse probability weighted estimation of marginal structural models (Robins et al, 2000). For an accessible tutorial on these methods, see Daniel et al (2013). Related to this longitudinal setting is the identification of optimal treatment regimes, for example in HIV/AIDS research where questions such as ‘at what level of CD4 should HAART (highly active antiretroviral therapy) be initiated?’ are often asked. These can be addressed using the methods listed above, and other related methods (see Chakraborty and Moodie, 2013).
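The problem and one solution can be shown with a two-period simulation. Here A0 and A1 are treatments at two time points and L1 is a confounder of A1 that is itself affected by A0; the variable names, the data-generating values and the non-parametric form of the g-computation formula (stratum means rather than fitted models) are all assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1986)
n = 200_000

# Two treatment times (A0, A1) and a time-varying confounder L1 (assumed values).
a0 = (rng.random(n) < 0.5).astype(int)
l1 = (rng.random(n) < 0.6 - 0.3 * a0).astype(int)   # L1 is affected by earlier treatment
a1 = (rng.random(n) < 0.2 + 0.6 * l1).astype(int)   # L1 influences later treatment
y = 2.0 * l1 - 1.0 * a1 - 0.5 * a0 + rng.normal(size=n)

# By construction, always-treat versus never-treat changes the mean of Y by
# (2 * 0.3 - 1 - 0.5) - (2 * 0.6) = -2.1.


def g_formula_mean(set_a0, set_a1):
    """Non-parametric g-computation: sum over l of E[Y | a0, l, a1] * P(L1 = l | a0)."""
    in_a0 = a0 == set_a0
    total = 0.0
    for level in (0, 1):
        p_level = np.mean(l1[in_a0] == level)
        cell = in_a0 & (l1 == level) & (a1 == set_a1)
        total += y[cell].mean() * p_level
    return total


g_effect = g_formula_mean(1, 1) - g_formula_mean(0, 0)

# Standard regression adjusting for L1: blocks the part of A0's effect that acts via L1.
design = np.column_stack([np.ones(n), a0, a1, l1])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
adjusted_effect = coef[1] + coef[2]

print(f"g-formula: {g_effect:.2f}, regression adjusted for L1: {adjusted_effect:.2f}")
```

In this simulation, adjusting for L1 removes the pathway from A0 to Y through L1, while omitting L1 would leave A1 confounded. The g-computation formula avoids this dilemma by standardising over the distribution of L1 that would arise under each treatment strategy.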
Instrumental variables and Mendelian Randomisation
It is important to appreciate that non-ideal experimental data (e.g. suffering from noncompliance, missing data or measurement error) are not on a par with data arising from observational studies (as may be inferred from what is written above). Randomisation can be used as a tool to aid causal inference even when the randomised experiment is ‘broken’, for example as a result of noncompliance to randomised treatment. Such methods make use of randomisation as an instrumental variable (Angrist and Pischke, 2009). Instrumental variables have even been used with observational data, in particular when the instrument is a variable that holds genetic information (in which case it is known as Mendelian randomisation; see Davey Smith and Ebrahim, 2003) with genotype used in place of randomisation. This is motivated by the idea that genes are ‘randomly’ passed down from parents to offspring in the same way that treatment is allocated in double-blind randomised trials. Although this assumption is generally untestable (Hernán and Robins, 2006), there are situations in which it may be deemed more plausible than the other candidate set of untestable assumptions, namely conditional exchangeability.
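A minimal sketch of randomisation used as an instrument is the ratio (Wald) estimator: the effect of assignment on the outcome divided by the effect of assignment on treatment received. The simulated trial below is hypothetical and assumes a constant treatment effect of 1.0, no access to treatment in the control arm, and an unmeasured factor U that drives both uptake and outcome.

```python
import numpy as np

rng = np.random.default_rng(2009)
n = 200_000
true_effect = 1.0   # assumed constant effect of treatment received on the outcome

z = rng.random(n) < 0.5                    # randomised assignment (the instrument)
u = rng.normal(size=n)                     # unmeasured confounder
takes_if_offered = (u + rng.normal(size=n)) > 0
x = z & takes_if_offered                   # treatment received; none in the control arm
y = true_effect * x + 2.0 * u + rng.normal(size=n)

# Intention-to-treat effect of assignment on outcome, and on treatment received.
itt = y[z].mean() - y[~z].mean()
uptake = x[z].mean() - x[~z].mean()

# Ratio (Wald) instrumental variable estimator.
wald = itt / uptake

# 'As treated' comparison ignores randomisation and is confounded by U.
as_treated = y[x].mean() - y[~x].mean()

print(f"ITT: {itt:.2f}, uptake: {uptake:.2f}, Wald: {wald:.2f}, as treated: {as_treated:.2f}")
```

The estimator works here because assignment affects the outcome only through treatment received, which the simulation guarantees by construction. In real trials, and especially in Mendelian randomisation, that exclusion restriction is an assumption that cannot be verified from the data.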
Mediation analysis
Approaches (such as SEM) amenable to complex causal structures have opened the way to looking beyond the causal effect of an exposure on an outcome as a black box, and to asking ‘how does this exposure act?’. For example, if income has a positive effect on certain health outcomes, does this act simply by increasing access to health care, or are there other important pathways? Addressing such questions is the goal of mediation analysis and the estimation of direct/indirect effects (see Emsley et al, 2010, for a review). This area has seen an explosion of new methodology in recent years, with several semiparametric alternatives to SEM introduced.
Suggested Introductory Reading
Hernán MA, Robins JM (to appear, 2015) Causal Inference. Chapman & Hall/CRC. [First fifteen chapters available for download here.]
Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology. 10(1):37–48.
Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA (2002) Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. American Journal of Epidemiology. 155:176–184.
Pearl J (2010) An Introduction to Causal Inference. The International Journal of Biostatistics. 6(2): Article 7.
Angrist JD, Pischke J (2009) Mostly harmless econometrics: an empiricist’s companion. Princeton University Press.
Other references
Chakraborty B, Moodie EEM (2013) Statistical methods for dynamic treatment regimes. Springer.
Daniel RM, Cousens SN, De Stavola BL, Kenward MG, Sterne JAC (2013) Methods for dealing with time-dependent confounding. Statistics in Medicine. 32(9):1584–1618.
Davey Smith G, Ebrahim S (2003) ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology. 32:1–22.
Dawid AP (2000) Causal inference without counterfactuals. Journal of the American Statistical Association. 95(450):407–448.
Emsley RA, Dunn G, White IR (2010) Mediation and moderation of treatment effects in randomised controlled trials of complex interventions. Statistical Methods in Medical Research. 19(3):237–270.
Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology. 10(1):37–48.
Hernán MA, Robins JM (2006) Instruments for causal inference: an epidemiologist’s dream? Epidemiology. 17:360–372.
Hernández-Díaz S, Schisterman EF, Hernán MA (2006) The birth weight “paradox” uncovered? American Journal of Epidemiology. 164:1115–1120.
Kline RB (2011) Principles and Practice of Structural Equation Modeling, 3rd ed. The Guilford Press.
Robins JM (1986) A new approach to causal inference in mortality studies with a sustained exposure period – application to control of the healthy worker survivor effect. Mathematical Modelling. 7:1393–1512.
Robins JM, Blevins D, Ritter G, Wulfsohn M (1992) G-estimation of the effect of prophylaxis therapy for pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiology. 3:319–336.
Robins JM, Hernán MA, Brumback B (2000) Marginal structural models and causal inference in epidemiology. Epidemiology. 11:550–560.
Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika. 70(1):41–55.
Rubin DB (1978) Bayesian inference for causal effects: the role of randomization. The Annals of Statistics. 6:34–58.
Recent and ongoing methodological research in Causal Inference at LSHTM
Causal mediation analysis
Researchers from LSHTM are involved in several strands of research on mediation analysis, including dealing with multiple mediators, intermediate confounding and latent variables, specifically in studies of birthweight and infant mortality.
Mediation analysis in the presence of intermediate confounders and the links with SEM
Bianca De Stavola, Rhian Daniel (LSHTM) and George Ploubidis (IoE)
Intermediate confounders, i.e. variables that confound the mediator-outcome relationship and are affected by the exposure, are problematic for the decomposition of causal effects into direct and indirect components. The sufficient conditions most commonly cited for identifying natural direct and indirect effects (Pearl, 2001) include the so-called “cross-world assumption”, that conditionally on baseline confounders C, the counterfactuals Y(x,m) and M(x*) should be independent, even when x≠x*. This assumption precludes the existence of intermediate confounders. However, identification is also possible when this assumption is replaced by a weaker one (Petersen et al, 2006), namely that E{Y(1,m)−Y(0,m) | M(0)=m, C=c} = E{Y(1,m)−Y(0,m) | C=c}. Alternatively, Robins and Greenland (1992) showed that identification is also possible when this assumption is replaced by the condition that there can be no X–M interaction even on an individual level, i.e. that, for each subject i, Y_{i}(1,m)−Y_{i}(0,m) is the same for all levels of m. Both the Petersen et al assumption, and that of Robins and Greenland, can hold when intermediate confounding is present, but they imply restrictions on the form of the associational models to be fitted. In this work, we discuss these restrictions, together with further results, and in so doing clarify the link between the causal inference and SEM approaches to mediation analysis.
We have also written a routine in Stata (gformula) for estimating controlled direct effects and natural direct and indirect effects (or their randomized interventional analogues) in the presence of intermediate confounding using a fully parametric approach via Monte Carlo simulation.
Daniel RM, De Stavola BL and Cousens SN (2011) gformula: Estimating causal effects in the presence of time-varying confounding or mediation using the g-computation formula. The Stata Journal. 11(4):479–517.
De Stavola BL, Daniel RM, Ploubidis GB, Micali N (2015) Mediation analysis with intermediate confounding: structural equation modelling viewed through the causal inference lens. American Journal of Epidemiology. 181(1):64–80.
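For readers less familiar with the estimands discussed in this subsection, the standard counterfactual definitions for a binary exposure can be written as follows, using the Y(x,m) and M(x) notation from the text. The labels TCE, NDE and NIE are shorthand introduced here for illustration, and the decomposition shown is one common choice rather than the only one.

```latex
\begin{align*}
\mathrm{TCE} &= E\{Y(1, M(1))\} - E\{Y(0, M(0))\} && \text{(total causal effect)}\\
\mathrm{NDE} &= E\{Y(1, M(0))\} - E\{Y(0, M(0))\} && \text{(natural direct effect)}\\
\mathrm{NIE} &= E\{Y(1, M(1))\} - E\{Y(1, M(0))\} && \text{(natural indirect effect)}\\
\mathrm{TCE} &= \mathrm{NDE} + \mathrm{NIE}
\end{align*}
```

The terms Y(1, M(0)) involve an exposure level and a mediator value from different hypothetical worlds, which is why identification relies on assumptions of the cross-world type described above.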
Mediation analysis with multiple mediators
Rhian Daniel, Bianca De Stavola and Simon Cousens (LSHTM) and Stijn Vansteelandt (University of Ghent)
The many recent contributions to the causal inference approach to mediation analysis have focused almost entirely on settings with a single mediator of interest, or a set of mediators considered en bloc; in many applications, however, researchers attempt a much more ambitious decomposition into numerous path-specific effects through many mediators. In this work, we gave counterfactual definitions of such path-specific estimands in settings with multiple mediators, when earlier mediators may affect later ones, showing that there are many ways in which decomposition can be done. We discussed the strong assumptions under which the effects are identified, suggesting a sensitivity analysis approach when a particular subset of the assumptions cannot be justified. The aim was to bridge the gap from “single mediator theory” to “multiple mediator practice,” highlighting the ambitious nature of this endeavour and giving practical suggestions on how to proceed.
Daniel RM, De Stavola BL, Cousens SN, Vansteelandt S (2015) Causal mediation analysis with multiple mediators. Biometrics. 71(1):1–14.
The (low) birthweight paradox
Bianca De Stavola, Richard Silverwood (LSHTM)
Overall, maternal factors such as low socioeconomic position and smoking lead to a higher incidence of infant mortality. However, this relationship has been found to be reversed for babies of low birthweight, with factors such as maternal smoking appearing protective. This has been termed the (low) birthweight paradox and various explanations have been offered. One of these is that the apparent reversal of the effect is due to unaccounted confounding between birthweight and infant mortality. We are currently investigating this phenomenon in the ONS and Scottish Longitudinal Studies. The methodological aspect of this work involves incorporating a latent class approach to account for some of the unmeasured confounding.
Longitudinal causal effects and time-dependent confounding
In the setting of longitudinal data, LSHTM researchers are involved in methods for inferring short-term and total effects, methods for use when there is strong confounding, and methods for use with routinely collected data, and have recently been involved in pedagogic and software-development work.
Daniel RM, Cousens SN, De Stavola BL, Kenward MG, Sterne JAC (2013) Methods for dealing with time-dependent confounding. Statistics in Medicine. 32(9):1584–1618.
Daniel RM, De Stavola BL and Cousens SN (2011) gformula: Estimating causal effects in the presence of time-varying confounding or mediation using the g-computation formula. The Stata Journal. 11(4):479–517.
Application of Marginal Structural Models (MSMs) with Inverse Probability of Treatment Weighting (IPTW) to primary care records in the clinical area of diabetes
Ruth Farmer, Krishnan Bhaskaran (LSHTM) and Debbie Ford (MRC CTU)
Existing research is conflicting over whether the first-line treatment of metformin for type 2 diabetes is protective against the development of cancer. Within this context, time-varying measures such as blood glucose level (HbA1c) and BMI are determinants of treatment, may be affected by prior treatment, and may also have an independent effect on risk of cancer. Work is ongoing to apply MSMs with IPTW to deal with time-dependent confounding in this context, using data from the Clinical Practice Research Datalink (CPRD). This will be one of the first attempts to apply MSM methodology to a real-world problem in a “big data” setting. There is a particular methodological focus on how the diabetes context and use of routinely collected data may lead to violation of the underlying assumptions needed for the MSM to produce valid causal inferences, and potential solutions to this.
Inferring short-term and total effects from longitudinal data
Ruth Keogh (LSHTM) and Stijn Vansteelandt (University of Ghent)
In longitudinal studies in which measures of exposures and outcomes are observed at repeated visits, interest may lie in studying short term or long term exposure effects. A short term effect is defined as the effect of an exposure at a given time on a concurrent outcome. Long term effects are the effects of earlier exposures on any subsequent outcome, and interest may be in two types of long term effect: (1) the total effect of an exposure at a given time on a subsequent outcome, including both indirect effects mediated through intermediate exposures and direct effects; (2) the joint effects of exposures at different time points on a subsequent outcome, which requires a separation of direct and indirect effects.
The emphasis in the statistical causal inference literature has been on studying joint effects, and in particular on special methods for handling the complications of time-dependent confounding. These complications occur when time-varying confounders are affected by past exposures, and they cannot be handled by standard regression adjustment. However, investigating short term or total exposure effects provides a simpler starting point. Moreover, these effect estimates may often be the most useful, for example for a doctor making a decision about starting a patient on a treatment. In this work we have shown how, with careful control for confounding by past exposures, outcomes and time-varying covariates, short term and total effects can be estimated using conventional regression analysis. This approach is based on sequential conditional mean models (CMM), including an extension to include propensity score adjustment. We have used simulation studies to compare this approach with IPW, finding that sequential CMMs give more precise estimates than IPW and provide double robustness via propensity score adjustment. As part of this work we have also developed a new test of whether there are direct effects of past exposures on a subsequent outcome not mediated through intermediate exposures.
A manuscript is forthcoming.
Propensity score adjustment and strong confounding
Rhian Daniel (LSHTM) and Stijn Vansteelandt (University of Ghent)
Regression adjustment for the propensity score (p(C)=Pr(X=1|C), where X is the exposure and C are confounders) is a rarely used alternative to the other propensity score methods, namely stratification, matching and inverse weighting. In recent work, we clarified the rationale for its use, in particular for estimating the information-standardised effect:
E{w(C)(Y^{1}−Y^{0})} (*)
where w(C)=[p(C){1−p(C)}] / E[p(C){1−p(C)}], since in GLMs, adjustment for the propensity score leads to a consistent estimator of (*) if the propensity score model is correctly specified, even if its conditional relationship with the outcome (in the GLM) has been misspecified.
The estimand (*) is attractive in settings with strong confounding, since it gives greatest weight to those in the centre of the propensity score distribution, and least weight to those who, on the basis of their confounders, were nearly certain either to be exposed or not, about whom the observational data carry very little information on the treatment effect. When there is strong confounding, so that some subjects are bound to be exposed/unexposed, estimating the more usual estimand E(Y^{1}−Y^{0}) may be too ambitious, and anyway may not be of interest, since it requires asking what would happen if everyone were exposed, even though we know that some subjects, on the basis of their confounders, will never be exposed.
In ongoing work, we are extending this thinking to longitudinal studies where the problem of strong confounding becomes arguably even more acute. Typically, the regimes that are compared by g-methods are “always treat”, “never treat” etc. More pragmatic estimands may be sensible in situations where very few subjects have a propensity to be always/never treated.
Vansteelandt S, Daniel RM (2014) On regression adjustment for the propensity score. Statistics in Medicine; 33(23):4053–4072.
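To make the weighting in the estimand (*) concrete, the short sketch below evaluates w(C) for a handful of hypothetical propensity scores, replacing the expectation in the denominator by a sample mean. It only illustrates which subjects receive most weight; it is not the regression-adjustment estimator itself, and the function name and input values are invented for the example.

```python
import numpy as np

def information_weights(p):
    """w(C) = p(C){1 - p(C)} / E[p(C){1 - p(C)}], with the expectation
    approximated by the sample mean over the supplied propensity scores."""
    p = np.asarray(p, dtype=float)
    raw = p * (1.0 - p)
    return raw / raw.mean()

# Hypothetical fitted propensity scores, from near-certainly unexposed
# (0.02) to near-certainly exposed (0.98).
p_hat = np.array([0.02, 0.25, 0.50, 0.75, 0.98])
w = information_weights(p_hat)

for p_i, w_i in zip(p_hat, w):
    print(f"p(C) = {p_i:.2f}  ->  w(C) = {w_i:.3f}")
```

The weights average to one by construction, peak at p(C)=0.5 and shrink towards zero at either extreme, which is the behaviour described above for settings with strong confounding.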
Causal inference and missing data
There are many links between the concepts and methods used in the fields of causal inference and missing data, and several LSHTM researchers are working on this intersection:
A doubly robust estimator to handle missing data and confounding simultaneously, with a focus on data from e-health records
Elizabeth (Fizz) Williamson (LSHTM)
Fizz recently developed a doubly robust estimator that combines an element of robustness to the models used to handle missing data with an element of robustness to the models used to handle confounding. She is currently extending this estimator to more realistic scenarios, particularly those with several partially missing variables. She is also working on a series of projects investigating methods for handling missing data within propensity score analyses, with an emphasis on analyses using data drawn from electronic health records.
Williamson EJ, Forbes A, Wolfe R (2012) Doubly robust estimators of causal exposure effects with missing data in the outcome, exposure or a confounder. Statistics in Medicine. 31(30):4382–4400.
Methods for estimating treatment effect when there are departures from protocol (noncompliance and missing data) in a randomised trial
Karla Diaz-Ordaz (LSHTM)
Mendelian randomization
The technique of Mendelian randomization is used in applied research at LSHTM, particularly in the field of cardiovascular and other noncommunicable diseases. Alongside this, methodological work inspired by the applied problems is also carried out.
Investigating non-linear effects with Mendelian randomization
Richard Silverwood and Frank Dudbridge (LSHTM)
Mendelian randomization studies have so far restricted attention to linear associations relating the genetic instrument to the exposure, and the exposure to the outcome, but this may not always be appropriate. For example, alcohol consumption is consistently reported as having a U-shaped association with cardiovascular events in observational studies. Richard Silverwood, Frank Dudbridge and others (Silverwood et al, 2014) proposed a novel method to assess non-linear causal effects using a binary genotype in Mendelian randomization studies, based on estimating local average treatment effects for discrete levels of the exposure range, then testing for a linear trend in those effects. Their method gave a conservative test for non-linearity under realistic violations of the key assumption in extensive simulations, making it useful for inferring departure from linearity when only a binary instrument is available. They found evidence for a non-linear causal effect of alcohol intake on several cardiovascular traits in the Alcohol-ADH1B Consortium, using the single nucleotide polymorphism rs1229984 in ADH1B as a genetic instrument.
Silverwood RJ, Holmes MV, Dale CE, et al. (2014) Testing for non-linear causal effects using a binary genotype in a Mendelian randomization study: application to alcohol and cardiovascular traits. International Journal of Epidemiology; 43(6):1781–1790.
Causal inference for Health Economics
One of the most active causal inference research groups at the school is led by Richard Grieve, and focuses on the area of health economics. As such, there is overlap with the CSM theme of the same name. The methodological aspects of this work are outlined below.
Causal inference approaches for handling external validity, estimating continuous treatments, and handling aspects of time-varying confounding
Richard Grieve, Noemi Kreif, Karla Diaz-Ordaz, Manuel Gomes, Zia Sadique (LSHTM) and Jasjeet Sekhon (UC Berkeley)
LSHTM researchers are extending causal inference approaches to: identify population treatment effects from RCTs, estimate the effects of ‘continuous’ treatments, and handle aspects of time-varying confounding in evaluating new health policies from observational data.
A general concern is that RCTs may fail to provide unbiased estimates of population average treatment effects. We have derived the requisite assumptions to identify population average treatment effects from RCTs. Our research provides placebo tests, which formally follow from the identifying assumptions and can assess whether they hold. We offer new research designs for estimating population effects that use non-randomised studies to adjust the RCT data. This approach is illustrated in a study evaluating the clinical and cost-effectiveness of a clinical intervention: pulmonary artery catheterisation (see Hartman et al, 2015).
These placebo tests reveal that in some trial settings, the requisite underlying assumptions for estimating population treatment effects are not satisfied. This external validity concern was illustrated by an RCT of an intervention in primary care, ‘Telehealth’, in which only 20% of eligible patients agreed to participate. To address the external validity issue, we developed sensitivity analyses that combine RCT and observational data to re-estimate treatment effects (Steventon et al, 2015).
When evaluating the effects of continuous treatments (for example according to different dosages of a drug), the generalised propensity score (GPS) can be used to adjust for confounding. However, unbiased estimation of the dose-response function assumes that both the GPS and the outcome-treatment relationship have been correctly specified. We introduce a machine learning method, the “Super Learner”, for model selection in estimating continuous treatment effects. We compare this Super Learner approach to parametric implementations of the GPS, and to outcome regression methods, in a re-analysis of the Risk Adjustment In Neurocritical care (RAIN) cohort study. Our paper highlights the importance of principled model selection for applied empirical analysis (Kreif et al, 2015).
A further strand of research considers alternative approaches for handling confounding in studies where outcomes are measured before and after an intervention. Our research contrasts the synthetic control method for the evaluation of health policies with difference-in-differences (DiD) estimation. The synthetic control approach estimates treatment effects by constructing a weighted combination of control units, to represent outcomes the treated group would have experienced in the absence of receiving the treatment. DiD estimation assumes that pre-treatment, the outcomes of the treated and control groups follow parallel trends over time, whereas the synthetic control method allows for non-parallel trends. We extend the synthetic control approach to settings where there are multiple treated units (for example hospitals), in re-evaluating the effects of a recent hospital pay-for-performance (P4P) scheme on risk-adjusted hospital mortality. Ongoing research is contrasting the synthetic control and DiD approaches with matching and regression methods.
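The DiD contrast mentioned above can be sketched in a few lines for the simplest case of two groups and two periods. The function name and the outcome values (read as, say, hypothetical mortality percentages for a few hospitals) are illustrative assumptions, and the estimate is only interpretable as a treatment effect if the parallel trends assumption holds.

```python
import numpy as np

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means:
    (change over time in treated) minus (change over time in controls)."""
    change_treated = np.mean(treated_post) - np.mean(treated_pre)
    change_control = np.mean(control_post) - np.mean(control_pre)
    return change_treated - change_control

# Hypothetical outcomes before and after a policy is introduced.
effect = did_estimate(
    treated_pre=[10.0, 12.0], treated_post=[9.0, 11.0],
    control_pre=[8.0, 10.0], control_post=[9.0, 11.0],
)
print(effect)  # -2.0
```

Here the treated group falls by 1 while the controls rise by 1, so the DiD estimate is -2. A synthetic control analysis would instead choose weights on the control units so that their weighted pre-intervention outcomes track those of the treated units; that step is not shown.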
Hartman, E., Grieve, R., Ramsahai, R. and Sekhon, JS. (2015). From sample average treatment effect to population average treatment effect on the treated: combining experimental with observational studies to estimate population treatment effects. JRSSA doi: 10.1111/rssa.12094
Free text available from: http://onlinelibrary.wiley.com/doi/10.1111/rssa.12094/abstract
Steventon A, Grieve R, Bardsley M (2015). An approach to assess generalizability in comparative effectiveness research: a case study of the Whole Systems Demonstrator cluster randomized trial comparing telehealth with usual care for patients with chronic health conditions. Medical Decision Making (in press).
Kreif, N, Grieve, R, Díaz, I, Harrison, D (2015). Evaluation of the effect of a continuous treatment: a machine learning approach with an application to treatment for traumatic brain injury. Health Economics (in press). Submitted version available as working paper from: http://www.york.ac.uk/media/economics/documents/hedg/workingpapers/1419.pdf
The first is the average causal effect (ACE) of X on Y expressed as a marginal risk difference, and the second is the average causal effect in the exposed (also called the average treatment effect in the treated, or ATT) expressed as a risk ratio (marginal with respect to confounders C). The third is a conditional causal log odds ratio, given C.
Sufficient conditions for these and other similar parameters to be identified can also be expressed in terms of potential outcomes. For the ACE, for example, these are:
Consistency: For x=0,1, if X=x then Y^{x}=Y
Conditional exchangeability: For x=0,1, Y^{x} ╨ X | C
 Health Economics

Theme Coordinators: Richard Grieve, Linda Sharples
Please see here for slides and audio recordings of previous seminars relating to this theme.
Overview
Policy-makers in many countries require accurate estimates of effectiveness and cost-effectiveness, to inform clinical guidelines, and for deciding which public health interventions and health care technologies to provide. Health economic evaluations may provide misleading evidence if they fail to address issues such as low external validity, confounding, noncompliance, missing data and clustering.
Our research programme draws on insights from the causal inference literature, to propose new approaches for providing accurate estimates of effectiveness and cost-effectiveness. Most of our researchers are based within the Team for Health Economics, Policy, and Technology Assessment.
Within the overall theme, the following areas are of specific interest:
Confounding
Health economic evaluations commonly use observational studies, either alongside or instead of data from randomised controlled trials (RCTs). A major concern is that the results suffer from treatment selection bias due to confounding variables that influence both treatment and outcomes. Commonly used analytical methods for dealing with confounding, such as regression analysis or propensity score matching, can be highly sensitive to model specification.
We are undertaking research on improving methods in economic evaluations for dealing with confounding. Our recent and ongoing research programme in this area covers the following topics:
 An assessment of the relative performance of a multivariate matching approach, Genetic Matching, in contrast to more conventional propensity score matching estimators [1–4]
 An evaluation of double-robust methods for addressing confounding in evaluations of effectiveness and cost-effectiveness [5–6]
 An investigation of a machine learning approach for evaluating the effectiveness of continuous treatments [7]
 A critical examination of the relative merits of the synthetic control method [8] for estimating treatment effects in longitudinal settings [8–9]
 Comparing approaches for addressing time-varying confounding when informing sequential decisions
 Considering mediation approaches in the context of decision-modelling and cost-effectiveness analysis
Key people: Richard Grieve, Noemi Kreif, Stephen O’Neill, Neil Hawkins.
Noncompliance
Recommendations encourage cost-effectiveness analyses (CEAs) to report intention to treat (ITT) estimates, but this may be insufficient for policy making as new decision problems may arise subsequent to the trial design. Clinical decision-makers also require cost-effectiveness estimates for patients who meet particular treatment thresholds, or for lower levels of compliance, which better reflect routine practice. Per protocol (PP) analyses are common but will provide biased estimates.
We aim to improve methods for estimating causal treatment effects after deviation from protocol. In particular, the instrumental variables (IV) approach we propose can deal with noncompliance in settings where there are multiple endpoints.
Key people: Richard Grieve
External validity
Evaluations that use RCTs lack external validity if the RCT and the target population differ according to patient and provider characteristics that modify relative cost-effectiveness. We lead NIHR-funded evaluations where clinical investigators suggest the potential lack of external validity could stop them applying the results in practice. We have developed methods that make plain the assumptions required to generalise effectiveness and cost-effectiveness estimates from an RCT to target populations of prime interest [10].
Key people: Richard Grieve
Missing data
In economic evaluations a common problem is that there are missing data, for example, because patients are lost to follow-up or fail to respond to quality-of-life or resource use questionnaires. Missing data may be problematic because individuals with missing information tend to be systematically different from those with complete data. Most published studies fail to address this issue and report cost-effectiveness inferences based solely on the complete cases. Inappropriate methods will lead to biased results, and ultimately can affect the decision of whether an intervention should be prioritised.
While standard multiple imputation methods have been proposed for handling missing data in cost-effectiveness analysis, these may be insufficient in many settings. For example, they assume that individual observations are independent (which may be implausible in multicentre studies or meta-analysis of individual-participant data) or that the imputation model is correctly specified. In addition, the methods proposed assume that data are missing at random, i.e. that the probability of missingness depends only on the observed data. However, the probability of missing costs or outcomes may depend on unobserved values, i.e. data may be missing not at random.
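The distinction drawn above between data missing at random and missing not at random can be stated formally. In the sketch below, R denotes the vector of missingness indicators and the data are partitioned into observed and missing parts; these symbols are introduced here for illustration and do not appear elsewhere on this page.

```latex
\begin{align*}
\text{Missing at random (MAR):} \quad
  & \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = \Pr(R \mid Y_{\mathrm{obs}}),\\
\text{Missing not at random (MNAR):} \quad
  & \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) \text{ depends on } Y_{\mathrm{mis}}.
\end{align*}
```

Because the second condition involves values that are never observed, it cannot be checked from the data at hand, which is why sensitivity analysis plays a central role when MNAR is plausible.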
We focus on the following areas:
 Comparing different approaches for dealing with clustering when addressing missing data in CEA [11].
 Assessing novel ‘doubly robust’ approaches that minimise the reliance on model specification when handling missing data in CEA that use non-randomised studies.
 Delineating strategies for dealing with cost-effectiveness data missing not at random.
 Developing tools to elicit expert opinion to inform sensitivity analysis to assumptions about the missing data in CEA studies.
 Investigating how untestable assumptions about data missing not at random link multiple imputation and Heckman selection models.
Key people: Alexina Mason, Manuel Gomes, Richard Grieve
Clustering
CEAs often use data from cluster randomised trials (CRTs), where randomisation is at the level of the cluster (for example the hospital) rather than the individual. Here, statistical methods are required that recognise that within-cluster costs and health outcomes may be correlated. However, most CEAs alongside CRTs use methods that assume observations are independent, which may lead to incorrect inferences.
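A simple way to see why ignoring clustering can mislead is the standard design effect for equal-sized clusters, 1 + (m − 1)ρ, where m is the cluster size and ρ the intracluster correlation coefficient. The sketch below uses hypothetical values for both, and covers a single outcome only; a full CEA would also need to handle the correlation between costs and health outcomes.

```python
def design_effect(cluster_size, icc):
    """Variance inflation 1 + (m - 1) * rho for equal-sized clusters with
    a common intracluster correlation coefficient rho."""
    if cluster_size < 1 or not 0.0 <= icc <= 1.0:
        raise ValueError("need cluster_size >= 1 and 0 <= icc <= 1")
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    return n_total / design_effect(cluster_size, icc)

# Hypothetical trial: 50 hospitals of 20 patients each, ICC of 0.05.
deff = design_effect(cluster_size=20, icc=0.05)
n_eff = effective_sample_size(n_total=1000, cluster_size=20, icc=0.05)
print(f"design effect = {deff:.2f}, effective sample size = {n_eff:.0f}")
```

With these illustrative numbers, 1000 patients carry roughly the information of about 513 independent ones, so an analysis that assumes independence would report standard errors that are too small.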
Our research interests include:
 Developing appropriate methods for CEA that use CRTs and providing guidance on their use [12–13]
 Investigating approaches for handling missing values that recognise the hierarchical structure of the data, such as random-effects and fixed-effects multiple imputation [14]
Key people: Manuel Gomes, Richard Grieve
References
1. Sekhon, Jasjeet. Multivariate and propensity score matching with automated balance search. Journal of Statistical Software 2011.
2. Sekhon, Jasjeet, and Richard D. Grieve. “A matching method for improving covariate balance in cost-effectiveness analyses.” Health economics 21, no. 6 (2012): 695–714.
3. Radice, Rosalba, Roland Ramsahai, Richard Grieve, Noemi Kreif, Zia Sadique, and Jasjeet S. Sekhon. “Evaluating treatment effectiveness in patient subgroups: a comparison of propensity score methods with an automated matching approach.” The international journal of biostatistics 8, no. 1 (2012).
4. Kreif, Noemi, Richard Grieve, Rosalba Radice, Zia Sadique, Roland Ramsahai, and Jasjeet S. Sekhon. “Methods for estimating subgroup effects in cost-effectiveness analyses that use observational data.” Medical Decision Making 32, no. 6 (2012): 750–763.
5. Kreif, Noémi, Richard Grieve, Rosalba Radice, and Jasjeet S. Sekhon. “Regression-adjusted matching and double-robust methods for estimating average treatment effects in health economic evaluation.” Health Services and Outcomes Research Methodology 13, no. 2–4 (2013): 174–202.
6. Kreif, Noémi, Susan Gruber, Rosalba Radice, Richard Grieve, and Jasjeet S. Sekhon. “Evaluating treatment effectiveness under model misspecification: a comparison of targeted maximum likelihood estimation with bias-corrected matching.” Statistical methods in medical research (2014): 0962280214521341.
7. Kreif, Noémi, Richard Grieve, Iván Díaz, and David Harrison. “Evaluation of the Effect of a Continuous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain Injury.” Health economics 24, no. 9 (2015): 1213–1228.
8. Abadie, Alberto, Alexis Diamond, and Jens Hainmueller. “Synthetic control methods for comparative case studies: Estimating the effect of California’s tobacco control program.” Journal of the American Statistical Association 105, no. 490 (2010).
9. Kreif, Noémi, Richard Grieve, Dominik Hangartner, Alex James Turner, Silviya Nikolova, and Matt Sutton. “Examination of the Synthetic Control Method for Evaluating Health Policies with Multiple Treated Units.” Health economics (2015). doi: 10.1002/hec.3258.
10. Hartman, Erin, Richard Grieve, Roland Ramsahai, and Jasjeet S. Sekhon. “From sample average treatment effect to population average treatment effect on the treated: combining experimental with observational studies to estimate population treatment effects.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 178, no. 3 (2015): 757–778.
11. Díaz-Ordaz, Karla, Michael G. Kenward, and Richard Grieve. “Handling missing values in cost effectiveness analyses that use data from cluster randomized trials.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 177, no. 2 (2014): 457–474.
12. Gomes, Manuel, Edmond SW. Ng, Richard Grieve, Richard Nixon, James Carpenter, and Simon G. Thompson. “Developing appropriate methods for cost-effectiveness analysis of cluster randomized trials.” Medical Decision Making 32, no. 2 (2012): 350–361.
13. Gomes, Manuel, Richard Grieve, Richard Nixon, Edmond S-W. Ng, James Carpenter, and Simon G. Thompson. “Methods For Covariate Adjustment In Cost-Effectiveness Analysis That Use Cluster Randomised Trials.” Health economics 21, no. 9 (2012): 1101–1118.
14. Ng, Edmond SW, Karla Diaz-Ordaz, Richard Grieve, Richard M. Nixon, Simon G. Thompson, and James R. Carpenter. “Multilevel models for cost-effectiveness analyses that use cluster randomised trial data: an approach to model choice.” Statistical methods in medical research (2013): 0962280213511719.
 Missing Data and Measurement Error

Theme Coordinators: Ruth Keogh, James Carpenter, Karla Diaz-Ordaz, Chris Frost
Please see here for slides and audio recordings of previous seminars relating to this theme.
Missing data
Background
The problem of missing data is almost ubiquitous in medical research, in both observational studies and randomized trials. Until the advent of sufficiently powerful computers, much of the research in this area was focused on the problem of how to handle, in a practicable way, the lack of balance caused by incompleteness. An example of such a development was the key idea of the EM algorithm (Dempster et al 1977). As routine computation became less of a problem, attention moved to the much more subtle issue of the consequences of missing data on the validity of subsequent analyses. The seminal work was Rubin (1976), from which all subsequent work in this area has developed to a greater or lesser degree.
Although the underlying missing data concepts are the same for observational and randomized studies, the emphases differ somewhat in practice in the two areas. However, both are the subject of development within the Centre. From 2002, supported by several grants from the Economic and Social Research Council, an entire programme has been developed around the handling of missing data in observational studies. This includes the development of multiple imputation in a multilevel setting (e.g. Goldstein et al 2009, Carpenter et al 2010), a series of short courses, and the establishment of a leading website devoted to the topic, which contains background material, answers to frequently asked questions, course notes, software, details of upcoming courses and events, a bibliography, and a discussion forum.
A central problem in the clinical trial setting is the appropriate handling of dropout and withdrawal in longitudinal studies. This has been the subject of great debate among academics, trialists and regulators for the last 10-15 years. Members of the Centre have had long involvement in this (e.g. Diggle and Kenward 1994, Carpenter et al 2002). A textbook was published by Wiley on the broad subject of missing data in clinical studies (Molenberghs and Kenward 2007). More recently the UK NHS National Coordinating Centre for Research on Methodology commissioned a monograph on the subject, which was published in 2008 (Carpenter and Kenward 2008). Members of the Centre are also actively involved in current regulatory developments. Two important documents have recently appeared. In the US, an FDA-commissioned National Research Council Panel on Handling Missing Data in Clinical Trials, chaired by Professor Rod Little, produced in 2010 a report, ‘The Prevention and Treatment of Missing Data in Clinical Trials.’ James Carpenter was one of several experts invited to give a presentation to this panel. Implementation of the guidelines in this report is to be discussed at the 5th Annual FDA/DIA Statistics Forum in April 2011, where Mike Kenward is giving the one-day pre-meeting tutorial on missing data methodology. In Europe, again in 2010, the CHMP released their ‘Guideline on Missing Data in Confirmatory Clinical Trials’. James Carpenter, Mike Kenward and James Roger were members of the PSI working party that provided a response to the draft of this document (Burzykowski T et al. 2009).
At the School there continues a broad research programme in both the observational study and randomized trials settings, and there is an active continuing programme of workshops. Missing data is an issue for many of the studies run and analysed within the School and there is much cross-fertilization across different research areas. There are also strong methodological links with other themes, especially causal inference; indeed, one recent piece of work explicitly connects the two areas (Daniel et al. 2011).
People
Those most directly involved in missing data research are:
Jonathan Bartlett, James Carpenter, Mike Kenward, James Roger (honorary), and two research students: Mel Smuk and George Vamvakis.
Many others have an interest in, and have contributed to, the area, including Rhian Daniel, Bianca de Stavola, George Ploubidis, and Stijn Vansteelandt (honorary).
References
Burzykowski T et al. (2009). Missing data: Discussion points from the PSI missing data expert group. Pharmaceutical Statistics. DOI: 10.1002/pst.391
Carpenter JR, Goldstein H and Kenward MG (2010). REALCOM-IMPUTE software for multilevel multiple imputation with mixed response types. Journal of Statistical Software, to appear.
Carpenter JR and Kenward MG (2008). Missing data in clinical trials – a practical guide. National Health Service Coordinating Centre for Research Methodology: Birmingham. Downloadable from http://www.haps.bham.ac.uk/publichealth/methodology/docs/invitations/Fi….
Carpenter J, Pocock S and Lamm C (2002). Coping with missing values in clinical trials: a model based approach applied to asthma trials. Statistics in Medicine, 21, 1043-1066.
Daniel RM, Kenward MG, Cousens S, de Stavola B (2009). Using directed acyclic graphs to guide analysis in missing data problems. Statistical Methods in Medical Research, to appear.
Dempster AP, Laird NM and Rubin DB (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39, 1-38.
Diggle PJ and Kenward MG (1994). Informative dropout in longitudinal data analysis (with discussion). Applied Statistics, 43, 49-94.
Goldstein H, Carpenter JR, Kenward MG and Levin K (2009). Multilevel models with multivariate mixed response types. Statistical Modelling, 9, 173-197.
Molenberghs G and Kenward MG (2007). Missing Data in Clinical Studies. Chichester: Wiley.
Rubin DB (1976). Inference and missing data. Biometrika, 63, 581-592.
Measurement Error
Background
The measurement of variables of interest is central to epidemiological study. Often, the measurements we obtain are noisy, error-prone versions of the underlying quantity of primary interest. Such errors can arise due to technical error induced by imperfect measurement instruments and short-term fluctuations over time. An example is a single measurement of blood pressure, considered as a measure of an individual’s underlying average blood pressure. Variables obtained by asking individuals to answer questions about their behaviour or characteristics are also often subject to error, either due to the individual’s inability to accurately recall the behaviour in question or a tendency, for whatever reason, to overestimate or underestimate the quantity being requested.
The consequences of measurement error in a variable depend on the variable’s role in the substantive model of interest (Carroll et al). For example, independent error in the continuous outcome variable in a linear regression does not cause bias. In contrast, measurement error in the explanatory variables of regression models does cause bias, in general. Measurement error in an exposure of interest may distort estimates of the exposure’s effect on the outcome of interest, while error in confounders will lead to imperfect adjustment for confounding, leading to biased estimates of the effect of an exposure.
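The contrast between error in the outcome and error in an explanatory variable can be illustrated with a small simulation. This is only a sketch under stated assumptions: a single exposure, classical measurement error (independent of the true value, mean zero), and hypothetical values for the sample size, the true slope and the variances. Under these assumptions the slope estimated from the error-prone exposure is expected to shrink towards zero by the factor var(true) / (var(true) + var(error)), which is 0.5 here.

```python
import random

def ols_slope(x, y):
    """Least-squares slope from a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(2024)       # fixed seed so the run is reproducible
n, true_slope = 20000, 2.0      # hypothetical sample size and true effect

x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]            # true exposure, variance 1
y = [true_slope * x + rng.gauss(0.0, 1.0) for x in x_true]  # outcome measured without error

x_noisy = [x + rng.gauss(0.0, 1.0) for x in x_true]  # exposure with classical error, variance 1
y_noisy = [v + rng.gauss(0.0, 1.0) for v in y]       # outcome with independent error

slope_clean = ols_slope(x_true, y)            # expected to be close to 2
slope_outcome_error = ols_slope(x_true, y_noisy)   # still close to 2, only less precise
slope_exposure_error = ols_slope(x_noisy, y)       # attenuated towards 2 * 0.5 = 1
```

With several explanatory variables or non-classical error the direction of the bias is no longer guaranteed, so the simple attenuation factor above should not be assumed to carry over.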
When explanatory variables in regression models are categorical, the analogue of measurement error is misclassification. Unlike measurement errors, which can often plausibly be assumed to be independent of underlying true levels, a misclassification error is never independent of the underlying value of the predictor variable, and so different theory covers the effects of misclassification and measurement errors (White et al).
Over the past thirty years a vast array of methods has been developed to accommodate measurement errors and misclassification in statistical analysis models. While simple methods, including method of moments correction and regression calibration, have sometimes been applied in epidemiological research, more sophisticated approaches, such as maximum likelihood (Bartlett et al) and semiparametric methods (Carroll et al), have received less attention. This is likely partly due to a relative scarcity of implementation in statistical software packages.
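As an illustration of the simplest of these methods, the sketch below applies a method of moments correction for regression dilution using two replicate measurements of the exposure. It assumes classical, independent errors in the replicates and a single explanatory variable; the data are simulated and all names and values are hypothetical. The reliability ratio is estimated as cov(w1, w2) / var(w1), since under these assumptions the covariance between replicates equals the variance of the true exposure.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)

rng = random.Random(7)          # fixed seed for reproducibility
n, true_slope = 50000, 2.0      # hypothetical sample size and true effect

x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]            # unobserved true exposure
w1 = [x + rng.gauss(0.0, 1.0) for x in x_true]              # first error-prone measurement
w2 = [x + rng.gauss(0.0, 1.0) for x in x_true]              # independent replicate
y = [true_slope * x + rng.gauss(0.0, 1.0) for x in x_true]  # outcome

naive_slope = covariance(w1, y) / covariance(w1, w1)    # attenuated estimate
reliability = covariance(w1, w2) / covariance(w1, w1)   # estimated reliability ratio
corrected_slope = naive_slope / reliability             # method of moments correction
```

In practice replicates are often available only for a subsample, and the uncertainty in the estimated reliability ratio should be carried through to the standard error of the corrected slope; neither is handled in this sketch.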
Areas for future research efforts
Greater recognition of the effects of measurement error and misclassification in the analysis of epidemiological and clinical studies.
Increasing the accessibility of methods to deal with measurement error, through dissemination of methods and the implementation of methods into statistical software.
Development of methods that allow for the effects of measurement errors in causal models that describe how risk factors, and therefore risks of disease, change over time.
References
Bartlett J. W., De Stavola B. L., Frost C. (2009). Linear mixed models for replication data to efficiently allow for covariate measurement error. Statistics in Medicine; 28: 3158-3178.
Carroll R. J., Ruppert D., Stefanski L. A., Crainiceanu C. M. (2006). Measurement error in nonlinear models. Chapman & Hall/CRC, Boca Raton, FL, US.
Frost C., Thompson S. G. (2000). Correcting for regression dilution bias: comparison of methods for a single predictor variable. Journal of the Royal Statistical Society A; 163: 173-189.
Frost C., White I. R. (2005). The effect of measurement error in risk factors that change over time in cohort studies: do simple methods overcorrect for ‘regression dilution’? International Journal of Epidemiology; 34: 1359-1368.
Gustafson, P. (2003). Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustments. Chapman and Hall/CRC Press.
White I., Frost C., Tokunaga S. (2001). Correcting for measurement error in binary and continuous variables using replicates. Statistics in Medicine; 20: 3441-3457.
Knuiman M. W., Divitini M. L., Buzas J. S., Fitzgerald P. E. B. (1998). Adjustment for regression dilution in epidemiological regression analyses. Annals of Epidemiology; 8: 56-63.
 Survival Analysis

Theme Coordinators: Bernard Rachet, Aurelien Belot
Please see here for slides and audio recordings of previous seminars relating to this theme.
Background
Survival analysis is at the core of any study of time to a particular event, such as death, infection, or diagnosis of a particular cancer. It is therefore fundamental to most epidemiological cohort studies, as well as many randomised controlled trials (RCTs).
An important issue in survival analysis is the choice of time scale: this could be for example time since entry into the study (or since first treatment in an RCT), time since a particular event (e.g. the Japanese tsunami), or time since birth (i.e. age). The latter is particularly relevant for epidemiological studies of chronic diseases, where age often exerts a substantial confounding effect (see [1], Chapter 6, for a discussion of alternative time scales).
Usually not all participants are followed up until they experience the event of interest, leading to their times being ‘censored’. In this case, the available information consists only of a lower bound for their actual event time. It is typically assumed that the process giving rise to censoring is independent of the process determining time to the event of interest. In contrast to most regression approaches (which typically involve modelling means of distributions given explanatory variables), many survival analysis models are defined in terms of the hazard (or rate) of the event of interest. Within this framework, the hazard is expressed as a function of explanatory variables and an underlying ‘baseline’ hazard. Fully parametric models assume a particular form for the baseline hazard, the simplest being that it is constant over time (Poisson regression). Cox’s proportional hazards model, perhaps the most popular model for survival data, makes no parametric assumptions about the baseline hazard. Both the Poisson and Cox regression models assume the hazards to be proportional for individuals with different values of the explanatory variables. This assumption can be relaxed, for example through use of Aalen’s additive hazard model.
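The simplest case mentioned above, a constant hazard with independent right censoring, can be sketched in a few lines. Under that assumption the maximum likelihood estimate of the hazard is the number of events divided by the total follow-up time, and with a single binary explanatory variable the hazard ratio is the ratio of the two rates. The follow-up data and group labels below are hypothetical and purely illustrative.

```python
import math

def constant_hazard(times, events):
    """Maximum-likelihood estimate of a constant hazard: events / person-time."""
    return sum(events) / sum(times)

# Hypothetical follow-up data: time in years and event indicator (1 = event, 0 = censored)
control_times = [2.0, 3.5, 1.0, 4.0, 5.0, 2.5]
control_events = [1, 1, 1, 0, 0, 1]
treated_times = [4.0, 5.0, 3.0, 5.0, 5.0, 2.0]
treated_events = [0, 0, 1, 0, 0, 1]

rate_control = constant_hazard(control_times, control_events)   # 4 events / 18 person-years
rate_treated = constant_hazard(treated_times, treated_events)   # 2 events / 24 person-years
hazard_ratio = rate_treated / rate_control

# Large-sample standard error of the log rate ratio: sqrt(1/d1 + 1/d0)
se_log_hr = math.sqrt(1 / sum(treated_events) + 1 / sum(control_events))
ci_low = hazard_ratio * math.exp(-1.96 * se_log_hr)
ci_high = hazard_ratio * math.exp(1.96 * se_log_hr)
```

Censored participants contribute follow-up time to the denominator but no event to the numerator, which is how the lower bound on their event time enters the estimate. With so few events the large-sample interval is very wide and is shown only to indicate the calculation.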
Generalizations to deal with repeated episodes of an event of interest, such as infection, are possible through the introduction of random effects that capture the correlation among events that occur to the same individual. Within the survival analysis literature these are referred to as frailty models. → Design and analysis for dependent data
An alternative approach to modelling survival data, more in keeping with most regression techniques, involves modelling the (logarithmically transformed) survival times directly. These are expressed as a linear function of explanatory variables and an error term, with a choice of distributions for the error terms leading to the family of accelerated failure time models. When the survival times are assumed to be exponentially distributed (equivalently, the error term on the log scale follows a standard extreme value distribution), the accelerated failure time model is equivalent to a Poisson regression model.
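The equivalence in the exponential case can be written out as follows. The notation (survival time T_i, covariate vector x_i, coefficients β, scale σ and error ε_i) is introduced here for illustration and does not come from the original text.

```latex
\begin{align*}
\log T_i &= x_i^\top \beta + \sigma \varepsilon_i
  && \text{(accelerated failure time model)} \\
\Pr(\varepsilon_i > e) &= \exp\{-\exp(e)\}, \quad \sigma = 1
  && \text{(standard extreme value errors)} \\
\Rightarrow\; \Pr(T_i > t) &= \exp\{-t \exp(-x_i^\top \beta)\}
  && \text{(exponential survival times)} \\
\Rightarrow\; h(t \mid x_i) &= \exp(-x_i^\top \beta)
  && \text{(constant hazard, log-linear in } x_i\text{)}
\end{align*}
```

The last line is the constant-hazard (Poisson) regression model with coefficients of opposite sign: a covariate that lengthens survival times lowers the hazard.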
Most of our applications of survival analysis models involve various flavours of the models mentioned above. However specific issues arise in certain contexts and are of interest to our group. These are discussed below.
Areas of current interest
Competing events
Censoring may occur for several reasons. A particular setting where censoring is not independent of the process governing the event of interest arises when there are competing events. Competing events are events that remove the individual from being at risk of the event of interest; in other words, they preclude its occurrence. This happens for example if we study lung cancer mortality while individuals may die of other causes. Obviously the termination of the follow-up of individuals who die from other causes is not the same as loss to follow-up, because the latter does not prevent the occurrence of the event of interest after time is censored.
The issues and methods arising for the analysis of competing events have been discussed in the biostatistical literature since the 1980s (for a review see [2]) but have not really filtered into epidemiological practice, with the notable exception of applications to AIDS research [3]. They are only marginally discussed in the RCT literature, where the problem is usually dealt with by creating composite events. → Analysis of clinical trials
There are two main possible approaches to the analysis of data affected by competing events:
a) Carrying out a so-called ‘cause-specific’ analysis, that is, adopting traditional survival analysis methods where competing events are treated as censoring events. Note however that ‘cause-specific’ in this context is a misnomer since the estimated effect depends on the rates generating all the other events (see [1], page 66). The main issue with this approach is one of interpretation, as all estimated effects are conditional on not having experienced the competing event.
b) Adopting a different focus, that is, modelling the cumulative incidence of the event of interest as opposed to its hazard (or rate). This approach was first proposed by Fine and Gray [4] but belongs to the broader family of inverse probability weighting (IPW) estimators (e.g. [5]) that has also been proposed in other contexts, notably to deal with informative missingness and selection bias [6-7]. → Causal inference, Missing data and measurement error
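To make the difference between the two approaches concrete, the sketch below estimates the cumulative incidence of the event of interest nonparametrically and compares it with one minus the Kaplan-Meier estimate obtained when competing events are treated as censoring. This is a covariate-free illustration of the quantity targeted in (b), not the Fine and Gray regression model itself; the data and the coding of causes (0 = censored, 1 = event of interest, 2 = competing event) are hypothetical.

```python
def cumulative_incidence(times, causes, cause=1):
    """Nonparametric cumulative incidence of `cause` by the end of follow-up.

    causes: 0 = censored, 1 = event of interest, 2 = competing event.
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    event_free = 1.0   # probability of having had no event of any type so far
    cif = 0.0
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [c for u, c in data if u == t]          # everyone leaving at time t
        d_cause = sum(1 for c in tied if c == cause)   # events of the type of interest
        d_any = sum(1 for c in tied if c != 0)         # events of any type
        cif += event_free * d_cause / at_risk
        event_free *= 1.0 - d_any / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return cif

# Hypothetical data for six individuals
times = [1, 2, 3, 4, 5, 6]
causes = [1, 2, 1, 2, 0, 1]

cif_interest = cumulative_incidence(times, causes, cause=1)
cif_competing = cumulative_incidence(times, causes, cause=2)

# Treating competing events as censoring gives 1 - Kaplan-Meier
# (with a single event type the formula above reduces to 1 - KM).
recoded = [1 if c == 1 else 0 for c in causes]
one_minus_km = cumulative_incidence(times, recoded, cause=1)
```

In this toy example the cumulative incidences of the two event types add up to one, whereas one minus Kaplan-Meier for the event of interest alone already equals one: it describes a hypothetical world in which the competing event cannot occur.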
Net survival
Information on cancer survival is essential for cancer control and has important implications for cancer policy. The primary indicator of interest is net survival, a conceptual survival metric: the survival that would be observed if patients were subject only to mortality from the disease of interest, with the mortality rate of this disease remaining as it is in the presence of competing events, which is the only situation that can actually be observed.
Two approaches attempt to estimate net survival: cause-specific survival and relative survival. Relative survival [8] is the standard approach to estimating population-based cancer survival when the actual cause of death is not accurately known. Although widely used in the cancer field, it can be applied to any disease at population level. Relative survival was originally defined as the ratio of the observed survival probability of the cancer patients to the survival probability that would have been expected if the patients had had the same mortality probability as the general population (background mortality) with similar demographic variables, e.g. age, sex, calendar year. Background mortality is derived from life tables stratified at least by age, sex and calendar time.
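The original ratio definition can be sketched as follows. Expected survival for each patient is built from life-table probabilities of death matched on that patient's demographic variables, and the cohort's expected survival is taken here as the simple average across patients, which is one of several possible conventions and an assumption of this sketch. All numbers are hypothetical.

```python
def expected_survival(annual_death_probs):
    """Expected survival over successive years, given life-table probabilities of death."""
    surv = 1.0
    for q in annual_death_probs:
        surv *= 1.0 - q
    return surv

def relative_survival(observed, expected):
    """Ratio of observed survival to expected (background) survival."""
    return observed / expected

# Hypothetical life-table probabilities of death for two patients over two years,
# matched on age, sex and calendar year
patient_probs = [
    [0.01, 0.01],   # younger patient
    [0.05, 0.06],   # older patient
]
individual_expected = [expected_survival(p) for p in patient_probs]
cohort_expected = sum(individual_expected) / len(individual_expected)

observed_two_year_survival = 0.60   # hypothetical all-cause survival of the patient group
rs = relative_survival(observed_two_year_survival, cohort_expected)
```

A relative survival of about 0.64 here would be read as the patients' two-year survival being 64% of that expected in a comparable group from the general population.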
Unbiased estimator of net survival
Both approaches (cause-specific and relative survival) provide biased estimation of net survival because of the competitive censoring, in particular due to age. An unbiased descriptive estimator of net survival using the principle of inverse probability weighting has been recently proposed alongside the modelling approach (Pohar Perme M, Stare J, Estève J. Biometrics 2011 – in review).
Multivariable excess hazard models
Relative survival is the survival analogue of excess mortality. Additive regression models for relative survival estimate the hazard at time t since diagnosis of cancer as the sum of the expected hazard (background) of the general population at time t and the excess hazard due to cancer [9-11]. More flexible models using splines for modelling the baseline excess hazard function of death, as well as the non-proportionality of the covariables’ effects, have been recently developed [12-14]; modelling the log-cumulative excess hazard has also been proposed [15-16]. Alternative approaches were recently developed [17]. Unbiased estimation of net survival requires the inclusion of the main censoring variables in the excess hazard models, variables usually included in the life tables [18].
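The additive structure described above can be summarised in symbols. The notation is introduced here for illustration only: t is time since diagnosis, a is age at diagnosis, y is calendar year at diagnosis, z denotes the life-table variables and x the covariates in the excess hazard model.

```latex
\begin{align*}
\lambda_O(t \mid x, z) &= \lambda_P(a + t,\, y + t \mid z) + \lambda_E(t \mid x)
  && \text{(observed = expected + excess hazard)} \\
S_N(t \mid x) &= \exp\!\left\{-\int_0^t \lambda_E(u \mid x)\, du\right\}
  && \text{(net survival from the excess hazard)}
\end{align*}
```

Here the expected hazard is taken as known from the life tables, so only the excess hazard is modelled and estimated.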
Current Work
Life tables
Estimation of net survival relies on accurate life tables. Methodology based on a multivariable flexible Poisson model has been developed in order to build complete, smoothed life tables for subpopulations, as defined by region, deprivation, ethnicity etc. [19].
Survival on sparse data
Contrasting with incidence and mortality, very little has been done on the estimation of survival based on sparse data or small areas [20]. The main challenge in survival is the additional dimension that is time since diagnosis. Multilevel modelling and Bayesian approaches are two main possible routes. Ultimately, presentation of such survival results can easily mislead healthcare policy makers, and methodological work on mapping and funnel plots is needed [21].
Public health relevance
Several indicators (avoidable deaths, population ‘cure’ parameters, crude probability of death, partitioned excess mortality) have been explored to present cancer survival results in ways more relevant for public health and health policy.
Missing data and misclassification
The analysis of routine, population-based data always faces the problem of incomplete data for which it may be difficult or impossible to obtain the required complementary information. A tutorial paper explored the estimation of relative survival when the data are incomplete [22]. Even when complete, tumour stage in particular may be misclassified, compromising comparison in cancer survival between subpopulations.
Disparities in cancer survival
Inequalities in cancer survival are still not well understood and structural equation modelling appears to be a possible approach to investigate potential causal pathways.
References
1. Clayton D and Hills M. Statistical Models in Epidemiology. Oxford University Press, 1993, Oxford.
2. Putter, H., Fiocco, M., and Geskus, R. B. Tutorial in biostatistics: Competing risks and multistate models. Statistics in Medicine. 2007: 26, 2389–2430.
3. CASCADE Collaboration. Effective therapy has altered the spectrum of cause specific mortality following HIV seroconversion. AIDS, 2006, 20:741–749
4. Fine, JP and Gray R J. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association. 1999: 94, 496–509.
5. Klein JP, Andersen PK. Regression Modeling of Competing Risks Data Based on Pseudovalues of the Cumulative Incidence Function. Biometrics 2005: 61, 223–229.
6. Robins JM, et al. Semiparametric regression for repeated outcomes with nonignorable nonresponse. Journal of the American Statistical Association. 1998; 93: 1321-1339.
7. Hernán MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology 2004; 15: 615-625.
8. Ederer F, Axtell LM, Cutler SJ. The relative survival: a statistical methodology. Natl Cancer Inst Monogr 1961; 6: 101-21.
9. Hakulinen T, Tenkanen L. Regression analysis of relative survival rates. J Roy Stat Soc Ser C 1987; 36: 309-17.
10. Estève J, Benhamou E, Croasdale M, Raymond L. Relative survival and the estimation of net survival: elements for further discussion. Stat Med 1990; 9: 529-38.
11. Dickman PW, Sloggett A, Hills M, Hakulinen T. Regression models for relative survival. Stat Med 2004; 23: 51-64.
12. Bolard P, Quantin C, Abrahamowicz M, Estève J, Giorgi R, Chadha-Boreham H, Binquet C, Faivre J. Assessing time-by-covariate interactions in relative survival models using restrictive cubic spline functions. J Cancer Epidemiol Prev 2002; 7: 113-22.
13. Giorgi R, Abrahamowicz M, Quantin C, Bolard P, Estève J, Gouvernet J, Faivre J. A relative survival regression model using B-spline functions to model non-proportional hazards. Stat Med 2003; 22: 2767-84.
14. Remontet L, Bossard N, Belot A, Estève J, FRANCIM. An overall strategy based on regression models to estimate relative survival and model the effects of prognostic factors in cancer survival studies. Stat Med 2007; 26: 2214-28.
15. Nelson CP, Lambert PC, Squire IB, Jones DR. Flexible parametric models for relative survival, with application in coronary heart disease. Stat Med 2007; 26: 5486-98.
16. Lambert PC, Royston P. Further development of flexible parametric models for survival analysis. Stata J 2010; 9: 265-90.
17. Perme MP, Henderson R, Stare J. An approach to estimation in relative survival regression. Biostatistics 2009; 10: 136-46.
18. Estève J, Benhamou E, Raymond L. Statistical methods in cancer research, volume IV. Descriptive epidemiology. (IARC Scientific Publications No. 128). Lyon: International Agency for Research on Cancer, 1994.
19. Cancer Research UK Cancer Survival Group. Life tables for England and Wales by sex, calendar period, region and deprivation. http://www.lshtm.ac.uk/ncdeu/cancersurvival/tools/, 2004.
20. Quaresma M, Walters S, Gordon E, Carrigan C, Coleman MP, Rachet B. A cancer survival index for Primary Care Trusts. Office for National Statistics, 7 Sep 2010. http://www.statistics.gov.uk/statbase/Product.asp?vlnk=15388
21. Spiegelhalter DJ. Funnel plots for comparing institutional performance. Statistics in Medicine 2005; 24: 1185-202.
22. Nur U, Shack LG, Rachet B, Carpenter JR, Coleman MP. Modelling relative survival in the presence of incomplete data: a tutorial. IJE 2010; 39: 118-28.
 Time Series Regression Analysis

Theme Coordinators: Antonio Gasparrini, Ben Armstrong
Please see here for slides and audio recordings of previous seminars relating to this theme.
This page is split into the following sections:
 Time series analysis for biomedical data
 Methodological issues
 Contributions of LSHTM researchers
 LSHTM people involved in developing or using time series regression methodology
 Publications by LSHTM researchers
 Key references on methods
1. Time series analysis for biomedical data
A time series may be defined as a sequence of measurements taken at (usually equally spaced) ordered points in time.
Statistical methods applied to time series data were originally developed mainly in econometrics, and then used in many other fields, such as ecology, physics and engineering. In the original application the focus was on prediction, and the aim was to produce an accurate forecast of future measurements given an observed series. The standard statistical approaches adopted for this purpose usually rely on autoregressive integrated moving average (ARIMA) and related models.
Time series designs are increasingly being exploited in biomedical data, due to the availability of routinely collected series of administrative or medical data, such as mortality or morbidity counts, environmental measures, and changes in socio-economic or demographic indices. Within this research area, time series methods have been subject to intense methodological developments over the last 20 years. In contrast with the original interest in prediction, the main aim of time series analysis in biomedical applications is commonly to assess the association between an outcome and either a predictor series or an intervention: here the focus is instead on estimation, and the models reduce to the more traditional regression framework, although possibly in non-standard versions.
Two main features characterize time series data from a statistical viewpoint: the correlation displayed by observations and their temporal sequence. Statistical models need to cope with the former, in order to provide accurate inferences, and may exploit the latter, with the intention to strengthen the evidence on the causal nature or clarify details of the association under study.
2. Applications of time series regression
2.1 Time series regression of short-term associations
A topic of intense methodological research and applications of time series analysis is the study of short-term health associations. In particular, time series methods have been widely applied in environmental epidemiology during the last decades to investigate the acute health effects of air pollution, and more recently outdoor temperature and other weather parameters. This approach exploits well-known decomposition techniques for time series data, which filter out long-term and seasonal trends in the analysis of short-term dependencies between time-varying environmental factors and health outcomes. This method controls by design for time-fixed factors or other confounders that change slowly in time.
Time series studies of short-term associations compare the outcome and exposure series, as in the example below, which illustrates the daily variation in mortality counts and outdoor temperature over a 14-year period in New York. The main methodological issues in this approach are the selection of smoothing methods for the decomposition of the series, the presence and estimation of delayed effects, and the potential confounding by other time-varying factors.
2.2 Interrupted time series for evaluating interventions or events
The importance of robust evaluation of public health interventions is increasingly recognised, yet public health interventions are often complex and evaluation methods used in clinical medicine (such as randomised controlled trials) are not always feasible. Other ‘quasi-experimental’ designs are therefore needed in order to explore the effect of an intervention on health outcomes, one of the strongest of which is the interrupted time series (ITS) design. ITS requires a series of observations taken repeatedly over time before and after an intervention. The underlying trend in the outcome is established and can be used to estimate the counterfactual, that is, what would have happened if the intervention had not taken place. The impact of the intervention is then assessed by examining any change in the post-intervention period given the trend in the pre-intervention period. The intervention may lead to a change in level, a change in slope or both. This framework is illustrated in the figure below.
Interrupted time series can be used to explore the impact of public health interventions or unplanned events. In the example illustrated in the figure below, an ITS design is adopted to assess the impact of the financial crisis in Spain on suicides. The main methodological issues surrounding the ITS design are the assumptions and methods used to model trends and to control for potential time-varying confounders of the before-after comparison.
Trend in monthly suicide rates for all of Spain before and since the financial crisis (Lopez Bernal et al 2013)
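The ITS framework of a change in level, a change in slope or both is commonly expressed as a segmented regression. The sketch below fits such a model by ordinary least squares to a hypothetical, noise-free monthly series and derives the counterfactual from the pre-intervention trend. It is a minimal illustration only: it ignores autocorrelation, seasonality and time-varying confounders, it treats the outcome as continuous rather than as a count or rate, and all names and values are assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * v for row, v in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def its_design(n_periods, start):
    """Columns: intercept, time, post-intervention indicator, time since intervention."""
    rows = []
    for t in range(n_periods):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, post * (t - start)])
    return rows

# Hypothetical series: 24 months before and 24 months after an intervention at t = 24
n_periods, start = 48, 24
design = its_design(n_periods, start)
true_beta = [10.0, 0.1, 2.0, 0.05]   # baseline level, trend, level change, slope change
y = [sum(b * v for b, v in zip(true_beta, row)) for row in design]   # noise-free for clarity

beta_hat = ols(design, y)

# Counterfactual at the last month: extrapolate the pre-intervention trend only
t_last = n_periods - 1
counterfactual = beta_hat[0] + beta_hat[1] * t_last
fitted = sum(b * v for b, v in zip(beta_hat, design[t_last]))
effect_at_end = fitted - counterfactual   # level change + slope change * (t_last - start)
```

The third and fourth coefficients correspond to the change in level and the change in slope respectively; with real data the residuals would need to be checked for autocorrelation before relying on the standard errors.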
3. Methodological issues
The regression analysis of time series biomedical data poses several methodological problems, which have prompted intense research in the last few years. The main research directions are summarized below. References are provided in the related sections.
 Model selection: time series models are usually built with a predefined set of potential confounders. However, some criteria are needed to select other model parameters, such as the degree of control for seasonal and long-term trends, or the adequacy of assumptions on the shape of the exposure-response relationship of predictors showing potential non-linear effects. Some investigators have tested the comparative performance of selection criteria based on information criteria (Akaike, Bayesian or related), minimization of partial autocorrelation of residuals, (generalized) cross-validation and others. Further research is needed to produce robust and general selection criteria.
 Smoothing methods: the specification of non-linear exposure-response relationships for predictors in the regression model is essential both to determine the association with the exposure of interest and to control for potential confounders. Smoothing techniques based on both parametric and non-parametric methods have been proposed in time series analysis. The former usually rely on regression splines within generalized linear models (GLM), while the latter are specified through smoothing or penalized splines within generalized additive models (GAM).
 Distributed lag (non-linear) models: commonly the effect of an exposure is not limited to the day it occurs, but persists for further days or weeks. This introduces the additional problem of modelling the lag structure of the exposure-response relationship. This issue was initially addressed by distributed lag models, which allow the linear effect of a single exposure event to be distributed over a specific period of time. More recently, this methodology has been generalized to non-linear exposure-response relationships through distributed lag non-linear models, a modelling framework which can flexibly describe simultaneously non-linear and delayed associations.
 Harvesting effect (mortality displacement): this phenomenon arises when applying an ecological time series analysis to grouped data, for example mortality counts. The conceptual framework is based on the assumption that the exposure affects mainly a pool of frail individuals, whose events are only brought forward by a brief period of time by the effect of exposure. For non-recurrent outcomes, the depletion of the pool following a high exposure event results in some reduction of cases a few days later, thereby reducing the overall long-term impact (see figure below). Specific models are needed to account for this reduction in the overall effect and thereby produce accurate estimates.
 Two-stage analysis: the usual approach to time series studies on environmental factors involves the analysis of series from multiple cities or regions. The complexity of the regression models prevents the specification of a very highly parameterized hierarchical structure in a single multilevel model. The analysis is instead carried out in two stages, with a common city-specific model and then a meta-analysis to pool the results. The specification of complex exposure-response relationships in the first stage requires the development of non-standard meta-analytic techniques, such as meta-smoothing and multivariate meta-analysis.
 Time-varying confounders: whilst interrupted time series designs are rarely affected by normal confounders, such as differences in socio-economic status or age composition, which typically only change relatively slowly over time, they may be affected by time-varying confounders. This is particularly an issue if the confounders are unmeasured and change over the same period as the intervention, for example other concurrent events or policies. Design adaptations may be introduced to address this limitation, such as the introduction of a control series, multiple baseline designs (where the intervention is introduced in different locations at different times) and multiple phases (where the intervention is first introduced then removed to test whether the effect is reversed).
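The idea behind the linear distributed lag model in the list above can be sketched as follows: the contribution of the exposure to the linear predictor on a given day is a weighted sum of the exposures on that day and on preceding days, and the overall effect of a unit exposure is the sum of the lag coefficients. The coefficients below are hypothetical values chosen for illustration, not estimates from any study, and the code is not taken from the dlnm package.

```python
def distributed_lag_contribution(exposure, lag_coefs):
    """Contribution of current and past exposure to the linear predictor on each day.

    For day t the contribution is the sum over lags l of lag_coefs[l] * exposure[t - l].
    Days without a complete exposure history are returned as None.
    """
    max_lag = len(lag_coefs) - 1
    contributions = []
    for t in range(len(exposure)):
        if t < max_lag:
            contributions.append(None)
        else:
            contributions.append(
                sum(lag_coefs[l] * exposure[t - l] for l in range(max_lag + 1))
            )
    return contributions

# Hypothetical lag coefficients (log relative risk per unit of exposure) for lags 0 to 3:
# positive at short lags and negative at longer lags, the pattern expected under harvesting
lag_coefs = [0.030, 0.020, -0.005, -0.015]

# A single exposure event of 10 units on day 3 of an 8-day series
exposure = [0, 0, 0, 10, 0, 0, 0, 0]

contrib = distributed_lag_contribution(exposure, lag_coefs)
overall_effect = sum(lag_coefs)   # net effect of a unit exposure accumulated over lags 0 to 3
```

In this example the excess risk on the day of exposure and the following day is partly offset by a deficit two and three days later, so the overall effect is smaller than the lag-0 effect alone. In applied work the lag coefficients are estimated, usually under smoothness constraints, rather than specified.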
4. Contributions of LSHTM researchers
4.1 Methodological innovations:
Statisticians at the LSHTM have contributed to time series regression methodology either in explicitly methodological papers or in innovations published in reports of substantive epidemiological studies. A tutorial paper by several LSHTM researchers provides an overview of methods (Bhaskaran et al. 2013). Another paper summarizes several issues as potential candidates for methodological work, focusing in particular on temperature-health associations (Gasparrini and Armstrong 2010).
Distributed lag nonlinear models. Several methodological articles have proposed a new, more flexible way to model lagged relationships through the framework of distributed lag nonlinear models (Armstrong 2006; Gasparrini et al. 2010), implemented in the R package dlnm (Gasparrini 2011) – see figure. Later work presents methods for estimating the number of deaths attributable to an exposure from distributed lag nonlinear models (Gasparrini and Leone 2014; Gasparrini et al. 2015).
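As a rough illustration of the idea behind distributed lag models, the Python sketch below covers only the simplest linear, unconstrained case, not the nonlinear framework implemented in dlnm. The function names and the lag coefficients are hypothetical. It builds a matrix of lagged exposures, computes the contribution of a one-day exposure pulse on each following day, and sums the lag-specific coefficients to give the net effect. Negative coefficients at longer lags, as would be expected under harvesting, reduce that net effect.

```python
import math

def lag_matrix(exposure, max_lag):
    """One row per day t >= max_lag; column l holds the exposure on day t - l."""
    return [[exposure[t - l] for l in range(max_lag + 1)]
            for t in range(max_lag, len(exposure))]

def lagged_contribution(row, lag_coefs):
    """Log relative risk on one day: sum over lags of beta_l * x_(t-l)."""
    return sum(b * x for b, x in zip(lag_coefs, row))

def cumulative_log_rr(lag_coefs):
    """Net log relative risk per unit of exposure, summed over all lags."""
    return sum(lag_coefs)

# Hypothetical log-RR coefficients for lags 0 to 4: an immediate increase
# followed by a deficit at longer lags (a harvesting-like pattern).
lag_coefs = [0.030, 0.010, -0.008, -0.006, -0.004]

# Hypothetical exposure series: a single one-unit pulse on day 4.
exposure = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

rows = lag_matrix(exposure, max_lag=4)
daily = [lagged_contribution(r, lag_coefs) for r in rows]

print("daily log RR after the pulse:", [round(d, 3) for d in daily])
print("lag-0 RR:", round(math.exp(lag_coefs[0]), 4))
print("net RR over lags 0-4:", round(math.exp(cumulative_log_rr(lag_coefs)), 4))
```

In practice the lag coefficients would be estimated from data, usually under smoothness constraints across lags, rather than supplied by hand as they are here.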
Two-stage analyses. Other methodological efforts have explored ways to pool and explore heterogeneity in estimates of nonlinear exposure-response relationships in two-stage analyses. The methods are based on multivariate meta-analytical techniques applied to estimates of multi-parameter associations from first-stage models, and implemented in the R package mvmeta (Gasparrini et al. 2011, 2012a) – see figure.
Modifiers of exposure-response associations. The modification of associations by area characteristics has been investigated using meta-regression in the two-stage approach discussed above (Tobías et al. 2014). Modification by individual characteristics (age, SES) varying within areas has been explored using interaction terms in simple models (Hajat et al. 2007). We have also proposed a version of the “case-only” approach, originally designed for studying gene-environment interactions, adapted to the time series context, to study how effects of time-varying risk factors (e.g. weather) might be modified by time-fixed factors, such as age or socioeconomic status (Armstrong 2003).
Heat waves. Two other papers have explored models to estimate the extent to which the excess deaths associated with heat waves can be explained by a continuous association between temperature and mortality, or whether an additional “wave effect” due to sustained heat is necessary (Gasparrini and Armstrong 2011; Hajat et al. 2006). Other work has developed ways to compare how the performance of heat-health warning systems depends on heat wave definitions (Gasparrini et al. 2010).
Mortality displacement (harvesting). Approaches have also been developed and applied to identify the extent of short-term “harvesting” (see above) (Hajat et al. 2005; Rehill et al. 2015).
Interrupted time series (ITS). A tutorial paper introduces the ARIMA and segmented regression approaches to ITS (Lagarde 2011). Other applied research has introduced methodological innovations. One paper evaluated the influence of alternative modelling assumptions on the estimated association between the introduction of statewide smoking bans and the incidence of acute myocardial infarction (Gasparrini et al. 2009). Other papers have pioneered the use of conditional Poisson models for multiple ITS designs (Grundy et al. 2009) and methods for controlled ITS (Milojevic et al. 2012).
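The segmented regression approach to ITS mentioned above can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions: ordinary least squares on a continuous outcome, with no adjustment for seasonality or autocorrelation, and a hypothetical noise-free series. The model is y_t = b0 + b1*t + b2*post_t + b3*(t - start)*post_t, where b2 is the level change and b3 the slope change at the intervention. The function names are illustrative and do not come from any of the cited papers.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def segmented_regression(y, start):
    """OLS estimates [b0, b1, b2, b3] for an intervention starting at index `start`."""
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, (t - start) * post])
    p = 4
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical monthly series: baseline 50, trend +0.5 per month, then a
# level drop of 8 and a slope change of -0.3 from month 24 onwards.
start = 24
y = [50 + 0.5 * t + (-8 - 0.3 * (t - start) if t >= start else 0.0)
     for t in range(48)]

b0, b1, level_change, slope_change = segmented_regression(y, start)
print(f"level change = {level_change:.2f}, slope change = {slope_change:.2f}")
```

For count outcomes such as deaths or admissions, a Poisson model would usually replace the least-squares fit, and a control series could be added as a further set of terms.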
Other recent and ongoing methodological work. A recent paper introduced the use of conditional Poisson models for the case-crossover and related formulations of time series (Armstrong et al. 2014). Methodological work continues, focused in particular on extending ways of characterising variation in distributed lag nonlinear models across cities or subpopulations, and on adapting time series regression methods for infectious diseases (Imai et al. 2014).
4.2 Applied research:
The substantive research using time series regression methods carried out at the LSHTM, or in which LSHTM researchers have collaborated, has mainly concerned the associations between daily occurrences of health outcomes (such as deaths) and time-varying environmental factors. The earliest examples (Gouveia and Fletcher 2000) concerned associations of daily air pollution with mortality, and this interest continues (Milojevic et al. 2014; Pattenden et al. 2010). Most of the focus, however, has been on associations of weather and season with health, which are of particular interest in the context of impending global warming. The most common health outcome has been mortality (Armstrong et al. 2010; Gasparrini et al. 2012b; Hajat et al. 2002; McMichael et al. 2008), but others have included hospital admissions (Pudpong and Hajat 2011), GP visits (Hajat and Haines 2002), viral disease (Lopman et al. 2009), foodborne disease (Kovats et al. 2004; Tam et al. 2006), diarrhoea (Hashizume et al. 2007, 2008a, 2010), pregnancy outcome (Lee et al. 2008; Wolf and Armstrong 2012), myocardial infarction (Bhaskaran et al. 2010, 2011, 2012) and defibrillator activation (McGuinn et al. 2012).
Several studies have focused in particular on which groups are vulnerable to the acute effects identified in time series regression, in particular those of weather (Hajat et al. 2007; Hajat and Kosatky 2010; Hashizume et al. 2008b; Wilkinson et al. 2004), but also those of limited daylight on injuries (Steinbach et al. 2014). Others have used time series regressions to predict the impact of climate change on deaths due to acute effects of heat and cold (Hajat et al. 2014; Vardoulakis et al. 2014).
Time series regression methods have also been used to study the association of circulating RSV and influenza with hospital admission (Mangtani et al. 2006), and the extent to which vaccination reduces the association of circulating influenza with mortality (Armstrong et al. 2004).
Studies applying interrupted time series methods include those exploring the association of the introduction of statewide smoking bans with cardiovascular morbidity (Barone-Adesi et al. 2011), the financial crisis with suicide rates in Spain (Lopez Bernal et al. 2013), 20 mph speed limits with road injuries (Grundy et al. 2009), and floods with mortality (Milojevic et al. 2011; Milojevic et al. 2012).
For other relevant papers, in particular more recent ones, see the personal web pages of the staff members, accessible from the list below.
5. LSHTM researchers involved in developing or using time series regression methodology
Antonio Gasparrini; Ben Armstrong; Clarence Tam; Jamie Lopez Bernal; Katherine Arbuthnott; Krishnan Bhaskaran; Mike Kenward; Mylene Lagarde; Paul Wilkinson; Punam Mangtani; Rebecca Steinbach; Sam Pattenden; Sari Kovats; Shakoor Hajat; Zaid Chalabi
6. Publications by LSHTM researchers
Armstrong B. 2006. Models for the relationship between ambient temperature and daily mortality. Epidemiology 17:624-631.
Armstrong B, Chalabi Z, Fenn B, Hajat S, Kovats RS, Milojevic A, et al. 2010. The association of mortality with high temperatures in a temperate climate: England and Wales. J Epidemiol Community Health [Epub ahead of print].
Armstrong BG. 2003. Fixed factors that modify the effects of time-varying factors: Applying the case-only approach. Epidemiology 14:467-472.
Armstrong BG, Mangtani P, Fletcher A, Kovats S, McMichael A, Pattenden S, et al. 2004. Effect of influenza vaccination on excess deaths occurring during periods of high circulation of influenza: Cohort study in elderly people. BMJ 329:660.
Armstrong BG, Gasparrini A, Tobias A. 2014. Conditional Poisson models: A flexible alternative to conditional logistic case crossover analysis. BMC Medical Research Methodology 14:122.
Barone-Adesi F, Gasparrini A, Vizzini L, Merletti F, Richiardi L. 2011. Effects of Italian smoking regulation on rates of hospital admission for acute coronary events: A country-wide study. PLoS One 6:e17419.
*Bhaskaran K, Hajat S, Haines A, Herrett E, Wilkinson P, Smeeth L. 2010. Short term effects of temperature on risk of myocardial infarction in England and Wales: Time series regression analysis of the Myocardial Ischaemia National Audit Project (MINAP) registry. British Medical Journal 341:c3823.
*Bhaskaran K, Hajat S, Armstrong B, Haines A, Herrett E, Wilkinson P, et al. 2011. The effects of hourly differences in air pollution on the risk of myocardial infarction: Case crossover analysis of the MINAP database. BMJ 343:d5531.
*Bhaskaran K, Armstrong B, Hajat S, Haines A, Wilkinson P, Smeeth L. 2012. Heat and risk of myocardial infarction: Hourly level case-crossover analysis of MINAP database. BMJ 345:e8050.
*Bhaskaran K, Gasparrini A, Hajat S, Smeeth L, Armstrong B. 2013. Time series regression studies in environmental epidemiology. International Journal of Epidemiology 42:1187-1195.
Gasparrini A, Gorini G, Barchielli A. 2009. On the relationship between smoking bans and incidence of acute myocardial infarction. European Journal of Epidemiology 24:597-602.
*Gasparrini A, Armstrong B. 2010. Time series analysis on the health effects of temperature: Advancements and limitations. Environmental Research.
*Gasparrini A, Armstrong B, Kenward M. 2010. Distributed lag nonlinear models. Statistics in Medicine 29(21): 2224-34.
*Gasparrini A. 2011. Distributed lag linear and nonlinear models in R: The package dlnm. Journal of Statistical Software 43:1-20.
*Gasparrini A, Armstrong B. 2011. The impact of heat waves on mortality. Epidemiology 22:68.
*Gasparrini A, Armstrong B, Kenward MG. 2011. Multivariate meta-analysis: A method to summarize nonlinear associations. Statistics in Medicine 30:2504–2506.
*Gasparrini A, Armstrong B, Kenward MG. 2012a. Multivariate meta-analysis for nonlinear and other multi-parameter associations. Statistics in Medicine 31:3821-3839.
*Gasparrini A, Armstrong B, Kovats S, Wilkinson P. 2012b. The effect of high temperatures on cause-specific mortality in England and Wales. Occupational and Environmental Medicine 69:56-61.
Gasparrini A, Leone M. 2014. Attributable risk from distributed lag models. BMC Medical Research Methodology 14:55.
Gasparrini A, Guo Y, Hashizume M, Lavigne E, Zanobetti A, Schwartz J, et al. 2015. Mortality risk attributable to high and low ambient temperature: A multicountry observational study. The Lancet. In Press
*Gouveia N, Fletcher T. 2000. Time series analysis of air pollution and mortality: Effects by cause, age and socioeconomic status. Journal of Epidemiology and Community Health 54:750.
Grundy C, Steinbach R, Edwards P, Green J, Armstrong B, Wilkinson P. 2009. Effect of 20 mph traffic speed zones on road injuries in London, 1986-2006: Controlled interrupted time series analysis. BMJ 339:b4469.
Hajat S, Haines A. 2002. Associations of cold temperatures with GP consultations for respiratory and cardiovascular disease amongst the elderly in London. Int J Epidemiol 31:825-830.
Hajat S, Kovats RS, Atkinson RW, Haines A. 2002. Impact of hot temperatures on death in London: A time series approach. J Epidemiol Community Health 56:367-372.
Hajat S, Armstrong BG, Gouveia N, Wilkinson P. 2005. Mortality displacement of heat-related deaths: A comparison of Delhi, São Paulo, and London. Epidemiology 16:613-620.
Hajat S, Armstrong B, Baccini M, Biggeri A, Bisanti L, Russo A, et al. 2006. Impact of high temperatures on mortality: Is there an added heat wave effect? Epidemiology 17:632-638.
Hajat S, Kovats RS, Lachowycz K. 2007. Heat-related and cold-related deaths in England and Wales: Who is at risk? Occup Environ Med 64:93-100.
Hajat S, Kosatky T. 2010. Heat-related mortality: A review and exploration of heterogeneity. Journal of Epidemiology and Community Health 64:753-760.
Hajat S, Vardoulakis S, Heaviside C, Eggen B. 2014. Climate change effects on human health: Projections of temperature-related mortality for the UK during the 2020s, 2050s and 2080s. Journal of Epidemiology and Community Health 68:641-648.
*Hashizume M, Armstrong B, Hajat S, Wagatsuma Y, Faruque AS, Hayashi T, et al. 2007. Association between climate variability and hospital visits for non-cholera diarrhoea in Bangladesh: Effects and vulnerable groups. Int J Epidemiol 36:1030-1037.
*Hashizume M, Armstrong B, Hajat S, Wagatsuma Y, Faruque AS, Hayashi T, et al. 2008a. The effect of rainfall on the incidence of cholera in Bangladesh. Epidemiology 19:103-110.
*Hashizume M, Wagatsuma Y, Faruque AS, Hayashi T, Hunter PR, Armstrong B, et al. 2008b. Factors determining vulnerability to diarrhoea during and after severe floods in Bangladesh. J Water Health 6:323-332.
*Hashizume M, Faruque ASG, Wagatsuma Y, Hayashi T, Armstrong B. 2010. Cholera in Bangladesh: Climatic components of seasonal variation. Epidemiology 21:706-710.
Imai C, Armstrong B, Chalabi Z, Hashizume M, Mangtani P. 2014. Application of traditional time series regression models for study of environmental determinants of infectious diseases. In: ISEE.
Kovats RS, Edwards SJ, Hajat S, Armstrong BG, Ebi KL, Menne B. 2004. The effect of temperature on food poisoning: A time-series analysis of salmonellosis in ten European countries. Epidemiol Infect 132:443-453.
Lagarde M. 2011. How to do (or not to do)… assessing the impact of a policy change with routine longitudinal data. Health Policy and Planning:czr004.
*Lee SJ, Hajat S, Steer PJ, Filippi V. 2008. A time-series analysis of any short-term effects of meteorological and air pollution factors on preterm births in London, UK. Environ Res 106:185-194.
*Lopez Bernal JA, Gasparrini A, Artundo CM, McKee M. 2013. The effect of the late 2000s financial crisis on suicides in Spain: An interrupted time-series analysis. European Journal of Public Health 23:732-736.
Lopman B, Armstrong B, Atchison C, Gray JJ. 2009. Host, weather and virological factors drive norovirus epidemiology: Time-series analysis of laboratory surveillance data in England and Wales. PLoS One 4:e6671.
Mangtani P, Hajat S, Kovats S, Wilkinson P, Armstrong B. 2006. The association of respiratory syncytial virus infection and influenza with emergency admissions for respiratory disease in London: An analysis of routine surveillance data. Clin Infect Dis 42:640-646.
*McGuinn L, Hajat S, Wilkinson P, Armstrong B, Anderson HR, Monk V, et al. 2012. Ambient temperature and activation of implantable cardioverter defibrillators. Int J Biometeorol.
McMichael AJ, Wilkinson P, Kovats RS, Pattenden S, Hajat S, Armstrong B, et al. 2008. International study of temperature, heat and urban mortality: The ‘ISOTHURM’ project. Int J Epidemiol.
Milojevic A, Armstrong B, Kovats S, Butler B, Hayes E, Leonardi G, et al. 2011. Long-term effects of flooding on mortality in England and Wales, 1994-2005: Controlled interrupted time-series analysis. Environ Health 10:11.
Milojevic A, Armstrong B, Hashizume M, McAllister K, Faruque A, Yunus M, et al. 2012. Health effects of flooding in rural Bangladesh. Epidemiology 23:107-115.
Milojevic A, Wilkinson P, Armstrong B, Bhaskaran K, Smeeth L, Hajat S. 2014. Short-term effects of air pollution on a range of cardiovascular events in England and Wales: Case-crossover analysis of the MINAP database, hospital admissions and mortality. Heart:heartjnl-2013-304963.
Pattenden S, Armstrong B, Milojevic A, Barratt B, Chalabi Z, Doherty R, et al. 2010. Ozone, heat and mortality in fifteen British conurbations. Occup Environ Med.
*Pudpong N, Hajat S. 2011. High temperature effects on outpatient visits and hospital admissions in Chiang Mai, Thailand. Science of the Total Environment 409:5260-5267.
*Rehill N, Armstrong B, Wilkinson P. 2015. Clarifying life lost due to cold and heat: A new approach. BMJ Open In Press.
*Steinbach R, Edwards P, Green J, Armstrong B. 2014. The contribution of light levels to ethnic differences in child pedestrian injury risk: A case-only analysis. Journal of Transport & Health 1:33-39.
*Tam CC, Rodrigues LC, O’Brien SJ, Hajat S. 2006. Temperature dependence of reported Campylobacter infection in England, 1989-1999. Epidemiol Infect 134:119-125.
Tobías A, Armstrong B, Gasparrini A, Diaz J. 2014. Effects of high summer temperatures on mortality in 50 Spanish cities. Environmental Health 13:48.
Vardoulakis S, Dear K, Hajat S, Heaviside C, Eggen B. 2014. Comparative assessment of the effects of climate change on heat- and cold-related mortality in the United Kingdom and Australia. Environmental Health Perspectives 122:1285–1292.
Wilkinson P, Pattenden S, Armstrong B, Fletcher A, Kovats RS, Mangtani P, et al. 2004. Vulnerability to winter mortality in elderly people in Britain: Population based study. BMJ 329:647.
*Wolf J, Armstrong B. 2012. The association of season and temperature with adverse pregnancy outcome in two German states, a time-series analysis. PLoS One 7:e40228.
* Research undertaken while the first author was a student at the LSHTM.
Last updated 1 April 2015. For more up-to-date publications, refer to researchers’ personal web pages.
7. Key references on methods
General
Bhaskaran K, Gasparrini A, Hajat S, Smeeth L, Armstrong B. Time series regression studies in environmental epidemiology. International Journal of Epidemiology. 2013;42(4):1187-1195.
Peng, R. D. and F. Dominici (2008). Statistical Methods for Environmental Epidemiology with R – A Case Study in Air Pollution and Health. New York, Springer.
Zeger, S. L., R. Irizarry and R. D. Peng (2006). On time series analysis of public health and biomedical data. Annual Review of Public Health 27: 57-79.
Armstrong, B. (2006). Models for the relationship between ambient temperature and daily mortality. Epidemiology 17(6): 624-31.
Dominici, F. (2004). Time-series analysis of air pollution and mortality: a statistical review. Research report – Health Effects Institute 123: 327; discussion 933.
Dominici, F., A. McDermott and T. J. Hastie (2004). Improved semiparametric time series models of air pollution and mortality. Journal of the American Statistical Association 99(468): 938-49.
Touloumi, G., R. Atkinson, A. Le Tertre, et al. (2004). Analysis of health outcome time series data in epidemiological studies. Environmetrics 15(2): 101-17.
On model selection
Dominici, F., C. Wang, C. Crainiceanu, et al. (2008). Model selection and health effect estimation in environmental epidemiology. Epidemiology 19(4): 558-60.
Crainiceanu, C. M., F. Dominici and G. Parmigiani (2008). Adjustment uncertainty in effect estimation. Biometrika 95(3): 635.
Baccini, M., A. Biggeri, C. Lagazio, et al. (2007). Parametric and semiparametric approaches in the analysis of short-term effects of air pollution on health. Computational Statistics and Data Analysis 51(9): 4324-36.
He, S., S. Mazumdar and V. C. Arena (2006). A comparative study of the use of GAM and GLM in air pollution research. Environmetrics 17(1): 81-93.
Peng, R. D., F. Dominici and T. A. Louis (2006). Model choice in time series studies of air pollution and mortality. Journal of the Royal Statistical Society: Series A 169(2): 179-203.
On smoothing methods
Marra, G. and R. Radice (2010). Penalised regression splines: theory and application to medical research. Statistical Methods in Medical Research 19(2): 107-25.
Schimek, M. G. (2009). Semiparametric penalized generalized additive models for environmental research and epidemiology. Environmetrics 20(6): 699-717.
Wood, S. N. (2006). Generalized Additive Models: an Introduction with R, Chapman & Hall/CRC.
Dominici, F., M. J. Daniels, S. L. Zeger, et al. (2002a). Air pollution and mortality: estimating regional and national dose-response relationships. Journal of the American Statistical Association 97: 100-11.
Dominici, F., A. McDermott, S. L. Zeger, et al. (2002b). On the use of generalized additive models in time-series studies of air pollution and health. American Journal of Epidemiology 156(3): 193-203.
On harvesting effect
Rabl, A. (2005). Air pollution mortality: harvesting and loss of life expectancy. Journal of Toxicology and Environmental Health: Part A 68(13-14): 1175-80.
Schwartz, J. (2001). Is there harvesting in the association of airborne particles with daily deaths and hospital admissions? Epidemiology 12(1): 55-61.
Schwartz, J. (2000b). Harvesting and long term exposure effects in the relation between air pollution and mortality. American Journal of Epidemiology 151(5): 440-8.
On distributed lag (nonlinear) models
Gasparrini A. Modeling exposure-lag-response associations with distributed lag nonlinear models. Statistics in Medicine. 2014;33(5):881-899.
Gasparrini, A. (2011). Distributed Lag Linear and Non-Linear Models in R: The Package dlnm. J Stat Softw 43(8): 1-20.
Gasparrini, A., B. Armstrong and M. G. Kenward (2010). Distributed lag nonlinear models. Statistics in Medicine 29(21): 2224-34.
Muggeo, V. M. (2008). Modeling temperature effects on mortality: multiple segmented relationships with common break points. Biostatistics 9(4): 613-20.
Schwartz, J. (2000a). The distributed lag between air pollution and daily deaths. Epidemiology 11(3): 320-6.
On meta-analytic techniques
Gasparrini, A., B. Armstrong, et al. (2012). Multivariate meta-analysis for nonlinear and other multi-parameter associations. Statistics in Medicine. 31:3821-3839.
Dominici, F., J. M. Samet and S. L. Zeger (2000). Combining evidence on air pollution and daily mortality from the 20 largest US cities: a hierarchical modelling strategy. Journal of the Royal Statistical Society: Series A 163(3): 263-302.
Schwartz, J. and A. Zanobetti (2000). Using meta-smoothing to estimate dose-response trends across multiple studies, with application to air pollution and daily death. Epidemiology 11(6): 666-72.
On interrupted time series
Wagner, A. K., S. B. Soumerai, F. Zhang, et al. (2002). Segmented regression analysis of interrupted time series studies in medication use research. Journal of Clinical Pharmacy and Therapeutics 27(4): 299-309.
Shadish WR, Cook TD, et al. (2002) Experimental and quasi-experimental designs for generalized causal inference.