
Comparison of approaches for handling missing data in survival analysis

Dr Bernard Rachet and Dr Ula Nur

Population-based cancer registries are often forced to create new tumour registrations from incomplete data, for which it may be difficult or impossible to obtain the missing information. Incomplete data are unavoidable in most research surveys, even when great effort goes into planning and data collection, and the problem is more common in routinely collected population-based data such as those held by cancer registries. Restricting analysis to complete records may yield inferences that differ substantially from those that would have been obtained had no data been missing. Ad hoc approaches such as complete-case analysis, mean substitution or the use of a separate category for records with 'missing' data can all introduce bias and reduce the precision of estimation. To our knowledge, little work has been done on the impact of missing data on survival estimates, and none at all on relative survival with cancer data. A common type of missing data in cancer survival analysis arises when the true date of diagnosis cannot be identified because the only available source of information is the death certificate. Because these death certificate only (DCO) records cannot be included in conventional survival analysis, their exclusion can give rise to bias in survival estimation. We propose to apply multiple imputation to the EUROCARE data in order to evaluate the impact of routinely excluding DCO records from relative survival estimation. Sensitivity analyses will be carried out to check the robustness of the results.
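
As a rough illustration of the final step of multiple imputation, the sketch below pools the estimates from several imputed datasets using Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and its total variance combines the average within-imputation variance with the between-imputation variance. The function name `pool_rubin` and the numbers in the usage note are hypothetical, chosen only for illustration. They are not EUROCARE results and do not represent the project's actual implementation.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching within-imputation variances
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    pooled = mean(estimates)        # average of the m estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total
```

For example, `pool_rubin([0.50, 0.52, 0.48, 0.51, 0.49], [0.0004] * 5)` takes five hypothetical survival estimates, each with variance 0.0004, and returns a pooled estimate with a total variance larger than 0.0004, because the spread between imputations reflects the extra uncertainty due to the missing data.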
