Abstract
Deep learning to estimate RECIST in cancer patients treated in real-world settings.
Authors
Irbaz Bin Riaz, Noman Ashraf, Gordon J Harris, Toni K. Choueiri, Kenneth L. Kehl
Organizations
Dana-Farber Cancer Institute, Boston, MA; Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA; Massachusetts General Hospital, Boston, MA
Research Funding
No funding received
Background:
Creating large oncology clinical-genomic datasets is laborious and time-consuming. Deep learning approaches that extract RECIST outcomes at scale from observational electronic health record (EHR) data could substantially facilitate precision oncology research.
Methods:
This retrospective study included patients with solid tumors treated on therapeutic clinical trials from 2004-2022 at Dana-Farber Cancer Institute with radiology reports in EHR and RECIST labels available from the tumor imaging metrics core. Each RECIST label was generated using corresponding radiology reports at a given time point. Patients were sampled into training, validation, and held-out test sets. A deep learning model (RECIST model) was trained to predict two outcomes: overall response and progressive disease at each time point, using reports from that time point and prior time points for each patient. This trained RECIST deep learning model was deployed on real-world radiology reports, and results were compared with true labels abstracted by trained human curators using the PRISSMM framework.
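The patient-level sampling into training, validation, and held-out test sets described above can be sketched as follows. This is a minimal illustration assuming scikit-learn; the function name, proportions, and variables are hypothetical and not taken from the study. The key point is that splitting is grouped by patient, so all reports from one patient land in the same subset.

```python
# Hedged sketch: patient-level splitting so every report from a given
# patient stays in a single subset (illustrative, not the study's code).
from sklearn.model_selection import GroupShuffleSplit


def split_by_patient(report_ids, patient_ids, test_frac=0.10, val_frac=0.10, seed=0):
    # First carve out the held-out test set, grouping by patient.
    gss = GroupShuffleSplit(n_splits=1, test_size=test_frac, random_state=seed)
    dev_idx, test_idx = next(gss.split(report_ids, groups=patient_ids))

    # Then split the remaining "development" reports into train and
    # validation, again grouping by patient to avoid leakage.
    dev_groups = [patient_ids[i] for i in dev_idx]
    gss2 = GroupShuffleSplit(
        n_splits=1, test_size=val_frac / (1 - test_frac), random_state=seed
    )
    tr, va = next(gss2.split(dev_idx, groups=dev_groups))

    train_idx = [int(dev_idx[i]) for i in tr]
    val_idx = [int(dev_idx[i]) for i in va]
    return train_idx, val_idx, [int(i) for i in test_idx]
```

Grouping by patient rather than by report matters because a patient's reports are highly correlated; splitting at the report level would inflate test performance.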
Results:
This study included 5153 patients with RECIST annotations and a total of 99,318 radiology reports (median age at protocol enrollment [IQR], 60 years [52-57]; female, 61% [n=3133]; white, 90% [n=4653]). The most common cancer types were breast (n=1006; 20%), lung (n=573; 11%), and ovarian (n=539; 10%). The training subset included 4121 (79.9%) patients, the validation subset 518 (10.1%), and the held-out test set 514 (9.9%). In the test set, AUCs were 0.86 for predicting overall response and 0.87 for predicting progressive disease, with best F1 scores of 0.72 and 0.63, respectively. The real-world data set included 4482 patients. Evaluated on real-world radiology reports, the RECIST model showed good performance for ascertaining PRISSMM annotations of progression from those reports (AUC 0.83; best F1, 0.68) but poor performance for ascertaining PRISSMM annotations of response (AUC 0.63; best F1, 0.28). Evaluation metrics are outlined in the Table.
Conclusions:
This study demonstrated the feasibility of using deep learning to predict RECIST outcomes from radiology reports for patients with solid tumors. The model accurately predicted RECIST labels for progressive disease both on a held-out test set and on real-world radiology reports. These findings could accelerate precision oncology research by providing a scalable way of ascertaining cancer outcomes from observational EHR data.
| Classification | Evaluation metric | Primary study: validation subset | Primary study: held-out test set | Real-world set |
| --- | --- | --- | --- | --- |
| Overall response | AUC | 0.8526 | 0.8637 | 0.6326 |
| | Average precision score | 0.74 | 0.74 | 0.15 |
| | Best F1 score | 0.7138 | 0.7167 | 0.2811 |
| Progressive disease | AUC | 0.8691 | 0.8671 | 0.8329 |
| | Average precision score | 0.66 | 0.66 | 0.72 |
| | Best F1 score | 0.6423 | 0.6267 | 0.6778 |
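The three metrics reported for each outcome (AUC, average precision, and best F1) can be computed from model probabilities roughly as follows. This is a hedged sketch assuming scikit-learn; "best F1" is interpreted here as the maximum F1 over candidate decision thresholds, a common convention but an assumption about this study's definition.

```python
# Hedged sketch of the table's evaluation metrics, assuming the model
# emits one probability per time point (y_score) against binary
# RECIST-derived labels (y_true). Illustrative only.
import numpy as np
from sklearn.metrics import (
    average_precision_score,
    precision_recall_curve,
    roc_auc_score,
)


def evaluate(y_true, y_score):
    auc = roc_auc_score(y_true, y_score)
    ap = average_precision_score(y_true, y_score)
    # Scan all thresholds on the precision-recall curve and take the
    # maximum F1; clip avoids division by zero where P + R == 0.
    precision, recall, _ = precision_recall_curve(y_true, y_score)
    f1 = 2 * precision * recall / np.clip(precision + recall, 1e-12, None)
    return auc, ap, float(f1.max())
```

Threshold-free metrics (AUC, average precision) are useful here because the operating threshold for a deployed model would likely be tuned separately for each outcome.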