Machine learning brings an early diagnostic for pancreatic cancer a step closer to reality

New study demonstrates the possibility of an effective new screening tool for high-risk patients

Individuals at higher risk of developing pancreatic cancer could be identified earlier using machine learning (ML) techniques, which would result in a greater number of patients surviving the disease, suggests a new study published in PLOS ONE.

The study was led by the London School of Hygiene & Tropical Medicine (LSHTM) and funded by the UK charity Pancreatic Cancer Research Fund (PCRF). It used UK electronic health records for more than 1,000 patients aged 15-99 years who were diagnosed with pancreatic cancer between January 2005 and June 2009.

The researchers examined numerous symptoms and health statuses recorded by a GP among patients up to two years before the cancer diagnosis. They then developed an algorithm which ‘learnt’ how to distinguish patients who went on to develop pancreatic cancer from those who didn’t. The algorithm was then used to identify those at high risk of developing pancreatic cancer just from GP records.

Using this technique, 41% of patients under the age of 60 were identified as high risk, up to 20 months prior to diagnosis. Over 72% of people who went on to be diagnosed would have been successfully identified as high risk (sensitivity) whilst 59% of people who did not develop cancer were correctly identified as low risk (specificity). Results were similar for patients over 60, with 43% identified at 17 months, with 65% sensitivity and 57% specificity.
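As a minimal illustration of how sensitivity and specificity are calculated, the sketch below counts true and false positives and negatives for a set of predictions. The function name and the toy labels are hypothetical and are not taken from the study; they only show the arithmetic behind the two measures.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels: 1 = later diagnosed with cancer / flagged high risk,
            0 = not diagnosed / flagged low risk.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases flagged high risk
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-cases flagged low risk
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-cases flagged high risk
    sensitivity = tp / (tp + fn)  # share of true cases identified as high risk
    specificity = tn / (tn + fp)  # share of non-cases identified as low risk
    return sensitivity, specificity


# Hypothetical toy data for illustration only (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example, three of the four true cases are flagged as high risk and four of the six non-cases are flagged as low risk.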

The team estimates that combining their algorithm with simple blood and urine tests that could potentially detect pancreatic cancer, which are currently under investigation, could result in 30 older and 400 younger patients being identified as ‘potential patients’ for each cancer detected. This could lead to the earlier diagnosis of around 60% of all pancreatic cancer tumours.

The authors acknowledge that further work is required to confirm, refine and evaluate the potential use of these findings in practice.

Dr Ananya Malhotra, co-lead author from the London School of Hygiene & Tropical Medicine, said: “Each year, 460,000 people worldwide are diagnosed with pancreatic cancer, and only around 5% of those diagnosed survive for five years or more. This low survival is because patients are usually diagnosed very late. Recent progress has been made in identifying biomarkers in the blood and urine, but these tests cannot be used for population screening as they would be very expensive and potentially harmful due to the psychological distress of excess testing.

“Although preliminary, this study offers some hope for a new early diagnosis for pancreatic cancer which until now remains elusive.”

Previous research has highlighted conditions associated with pancreatic cancer diagnosis such as jaundice, abdominal pain and new-onset diabetes. Whilst these new results are consistent with these findings, this approach is a step-change from these previous studies because the team examined whether it is possible to predict future pancreatic cancer based on the presence of a combination of symptoms or abnormalities more than 12 months before diagnosis, ignoring late-stage symptoms.

The case-control study used anonymised electronic health records from primary care linked to cancer registrations. The cases comprised 1,139 patients, aged 15-99 years, diagnosed with pancreatic cancer between January 2005 and June 2009. Each case was matched on age, sex and time of diagnosis to four controls, who were patients with a cancer other than pancreatic cancer. Disease and prescription codes for the 24 months prior to diagnosis were used to identify 57 individual symptoms, with models then trained to predict which patients later developed pancreatic cancer.
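The matching step described above can be sketched roughly as follows. This is an illustrative assumption rather than the authors' procedure: the article does not say how matching was carried out, so the field names (`age_band`, `sex`, `diag_year`), the exact-match rule and the function `match_controls` are all hypothetical.

```python
import random
from collections import defaultdict


def match_controls(cases, pool, n_controls=4, seed=0):
    """Match each case to n_controls from pool without replacement.

    Assumption: a control matches when it shares the case's age band,
    sex and year of diagnosis exactly.
    """
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for person in pool:
        key = (person["age_band"], person["sex"], person["diag_year"])
        by_key[key].append(person)
    for candidates in by_key.values():
        rng.shuffle(candidates)  # pick matching controls in random order

    matched = {}
    for case in cases:
        key = (case["age_band"], case["sex"], case["diag_year"])
        candidates = by_key[key]
        if len(candidates) < n_controls:
            raise ValueError(f"not enough controls for case {case['id']}")
        # pop() removes each chosen control so it cannot be reused
        matched[case["id"]] = [candidates.pop() for _ in range(n_controls)]
    return matched


# Hypothetical records for illustration only (not study data).
cases = [{"id": "case-1", "age_band": "60-64", "sex": "F", "diag_year": 2006}]
pool = [
    {"id": f"ctrl-{i}", "age_band": "60-64", "sex": "F", "diag_year": 2006}
    for i in range(6)
]
pool.append({"id": "ctrl-x", "age_band": "40-44", "sex": "M", "diag_year": 2008})

matched = match_controls(cases, pool)
print([c["id"] for c in matched["case-1"]])
```

Sampling without replacement keeps each control attached to a single case; a real analysis might instead use age tolerances or narrower diagnosis windows.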

The algorithm’s greatest potential is within a multiple-testing model where pancreatic cancer is one of several malignancies of interest. Another important finding was the relative importance of diabetes, over time-varying symptoms, in predicting later pancreatic cancer diagnosis, which is consistent with previous research.

Dr Laura Woods, study senior author from the London School of Hygiene & Tropical Medicine, said: “Using machine learning techniques we developed a risk score for pancreatic cancer diagnosis in order to identify patients for whom biomarkers might detect the disease at an early and treatable stage. After further work this approach could be applied in the primary care setting and has the potential to be used alongside a non-invasive biomarker test to increase earlier diagnosis. This would result in a greater number of patients surviving this devastating disease.”

Maggie Blanks, Pancreatic Cancer Research Fund’s Chief Executive Officer, said: “Using machine learning to help improve earlier diagnosis is truly novel and we’re extremely pleased that this pilot study has shown to have strong potential. We’re looking forward to seeing where this research leads, as earlier diagnosis will be a game-changer for improving survival for patients.”

The authors acknowledge limitations of the study, including the poor specificity of the models, which arises principally from the use of cancer patients as controls, who are not representative of the general population.

The research team is seeking further funding to develop this pilot study into a full investigation.


Ananya Malhotra, Bernard Rachet, Audrey Bonaventure, Stephen P Pereira, Laura M Woods. Can we screen for pancreatic cancer? Identifying a sub-population of patients at high risk of subsequent diagnosis using machine learning techniques applied to primary care data. PLOS ONE. DOI:10.1371/journal.pone.0251876
