Abstract

Predicting real-world overall survival for post-operative pancreatic ductal adenocarcinoma: An ASCO CancerLinQ Discovery analysis.

Author
Kaleem Sohail Ahmed (Department of Surgery, University of Wisconsin–Madison, Madison, WI), Sheriff M Issaka, Clayton Thomas Marcinak, Syed Nabeel Zafar
Organizations: Department of Surgery, University of Wisconsin–Madison, Madison, WI; Division of Surgical Oncology, Department of Surgery, University of Wisconsin–Madison, Madison, WI

Research Funding: NIH-NCI

Background: Pancreatic ductal adenocarcinoma (PDAC) is the second leading cause of cancer-related deaths in the US. There is a need to improve risk stratification of patients to facilitate treatment decisions. Novel datasets such as the American Society of Clinical Oncology (ASCO)'s CancerLinQ Discovery (CLQD), combined with machine learning (ML) methods, provide an opportunity to generate new insights into patient outcomes from "real-world data" (RWD). The objective of this study was to develop an ML model using RWD to predict 1-year survival (1YS) for patients undergoing pancreatectomy for PDAC.

Methods: Using the CLQD database, we identified patients diagnosed with PDAC who had undergone surgical resection between July 1998 and May 2021. Patients with complete data on procedure and tumor characteristics were included. Patients lost to follow-up within 12 months of surgery were excluded. We trained a multivariable logistic regression (LR) model as a baseline for comparison. Gradient Boosting Classifier (GBC) and Random Forest (RF) models were then trained on demographic and clinico-pathologic data, with and without pre-operative laboratory (lab) data. GBC-1 and RF-1 were trained only on demographic and clinico-pathologic data (n=1440). GBC-2 and RF-2 were trained on the subset of patients with non-missing lab data (n=669). We performed a one-way ANOVA to compare the area-under-the-curve (AUC) scores of each GBC and RF model with those of the corresponding LR.

Results: Of the 1440 patients analyzed, 44.7% were male, 53.6% were White, and the mean age was 69.7 ± 10.9 years. The majority had a T3-stage tumor (53.7%), and 70.3% received neoadjuvant, adjuvant, delayed, and/or recurrent chemotherapy. Unadjusted median OS was 16.5 months [95% CI: 15.1-17.6], and 1YS was 60.8% [95% CI: 58.2-63.3]. Without lab data, the GBC and RF models outperformed the baseline LR (Table 1). With lab data, RF-2 and LR-2 performed best (AUC: 0.83 each); without lab data, RF-1 performed best (AUC: 0.81). The factors most predictive of survival were receipt of chemotherapy, tumor size, pre-operative BMI, age, pre-operative neutrophil count, serum cholesterol, and serum CA-19-9 levels.

Conclusions: In this study, we show that ML models using clinicopathologic, demographic, and laboratory data provide more accurate prediction of 1-year survival than prior strategies. Such models can be used to identify patients at high risk for poor outcomes following pancreatectomy, for whom alternative treatments may be offered.

Table 1. Model performance with and without pre-operative laboratory data.

Model   Feature set                            AUC
LR-1    Full dataset, without labs (n=1440)    0.78
RF-1    Full dataset, without labs (n=1440)    0.81
GBC-1   Full dataset, without labs (n=1440)    0.79
LR-2    Limited dataset, with labs (n=669)     0.83
RF-2    Limited dataset, with labs (n=669)     0.83
GBC-2   Limited dataset, with labs (n=669)     0.82
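
For readers less familiar with the AUC metric reported in Table 1, the following is a minimal sketch of how it can be computed from predicted probabilities and observed 1-year outcomes. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical, and the abstract does not state which software was used. The sketch uses the pairwise (Mann-Whitney) definition, in which AUC is the probability that a randomly chosen survivor receives a higher predicted score than a randomly chosen non-survivor.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win
    (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study):
# 1 = alive at 1 year, 0 = died within 1 year of surgery.
y_true = [1, 1, 1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.8, 0.35, 0.4, 0.7, 0.2, 0.6, 0.55]

print(round(roc_auc(y_true, y_score), 3))  # 0.8 (12 of 15 pairs ranked correctly)
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, so the values of 0.78 to 0.83 in Table 1 indicate that each model ranks a survivor above a non-survivor in roughly four out of five such pairs.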
