Abstract

Evaluating generalizability of landmark randomized controlled trials in common metastatic cancers using machine learning-based emulated trials.

Author
Xavier Orcutt (Navajo Indian Health Service, Chinle, AZ), Qi Long, Ronac Mamtani, Ravi Bharat Parikh
Full text
Organizations: Navajo Indian Health Service, Chinle, AZ; University of Pennsylvania, Philadelphia, PA; Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA

Research Funding: No funding sources reported

Background: Randomized controlled trial (RCT) results often lack generalizability to real-world oncology patients (pts). We present TrialTranslator, a machine learning (ML) framework to evaluate the generalizability of Phase III oncology RCTs by emulating trials across prognostic risk groups.

Methods: TrialTranslator was applied to 11 landmark RCTs encompassing the most prevalent advanced cancers: non-small cell lung (aNSCLC), breast (mBC), prostate (mPC), and colorectal (mCRC). Flatiron Health’s EHR-derived de-identified database was used for model development and trial emulation. In Stage I, ML models (gradient boosted [GBM], random forest, support vector machine, and penalized Cox) were trained to predict survival from the time of metastatic diagnosis. Models used 130 longitudinal demographic, cancer-related, lab, and biomarker features. In Stage II (trial emulation), pts meeting RCT cancer and biomarker eligibility criteria were stratified into low-, medium-, and high-risk groups based on the top-performing ML model. For each risk group, we used inverse probability of treatment weighted survival analysis to compare the median overall survival (mOS) or median progression-free survival (mPFS) between experimental and control arms.

Results: GBM models achieved the highest time-dependent area under the receiver operating characteristic curve (AUC) in the test set in all cancer types. For example, the 1-year AUC was 0.783 for the aNSCLC GBM vs 0.689 for the Lung Cancer Prognostic Index (a validated Cox model). ML-predicted low-risk pts typically had mOS or mPFS similar to that of RCT populations (Table).
In emulated trials for low-risk pts, real-world survival benefit (the absolute difference in mOS or mPFS between experimental and control arms) was similar to RCT results. In contrast, for high-risk pts, survival benefits were not statistically or clinically significant. For example, in the FLAURA trial, RCT mPFS for the experimental arm (osimertinib) vs control (gefitinib or erlotinib) was 18.9 vs 10.2 months; in emulated trials, mPFS was 17.1 vs 10.3 months for low-risk pts and 5.1 vs 4.2 months for high-risk pts.

Conclusions: ML frameworks like TrialTranslator may facilitate individual patient-level estimation of real-world treatment benefits of novel oncology agents. TrialTranslator is accessible at trialtranslator.com.

Table. mOS or mPFS in months for experimental vs control (difference) in select trials. Low, Med, and High are ML risk groups in emulated trials.

| Cancer | Trial | Low | Med | High | Overall Real-World Cohort | RCT |
|---|---|---|---|---|---|---|
| aNSCLC | FLAURA | 17.1 vs 10.3 (6.9) | 12.2 vs 8.5 (3.7) | 5.1 vs 4.2 (0.9) | 10.7 vs 7.4 (3.3) | 18.9 vs 10.2 (8.7) |
| mBC | PALOMA-2 | 28.3 vs 21.9 (6.4) | 25.5 vs 18.4 (7.1) | 11.9 vs 9.0 (2.9) | 22.2 vs 14.8 (7.4) | 24.8 vs 14.5 (10.3) |
| mPC | LATITUDE | n/a vs 68.0 (n/a) | 50.3 vs 40.7 (9.7) | 22.1 vs 21.4 (0.8) | 50.0 vs 38.2 (11.7) | 53.3 vs 36.5 (16.8) |
| mCRC | FIRE-3 | 53.0 vs 45.6 (7.3) | 29.8 vs 27.4 (2.3) | 13.1 vs 12.8 (0.3) | 31.8 vs 24.7 (7.0) | 33.1 vs 25.6 (7.5) |
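
The Stage II comparison described in the Methods (inverse probability of treatment weighted survival analysis, summarized as a difference in median survival between arms) can be illustrated with a minimal Python sketch. This is not the TrialTranslator implementation: the function names `iptw_weights`, `weighted_km_median`, and `median_benefit` are hypothetical, the weights are assumed to be unstabilized, the propensity scores are assumed to come from a separately fitted model, and the median is taken from a weighted Kaplan-Meier curve. The abstract does not specify any of these details.

```python
from typing import List, Optional, Sequence


def iptw_weights(treated: Sequence[int], propensity: Sequence[float]) -> List[float]:
    """Unstabilized IPTW weights (an assumption): 1/p if treated, 1/(1-p) if control.

    `propensity` is each patient's estimated probability of receiving the
    experimental treatment, assumed to be produced elsewhere.
    """
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p)
            for t, p in zip(treated, propensity)]


def weighted_km_median(times: Sequence[float],
                       events: Sequence[int],
                       weights: Sequence[float]) -> Optional[float]:
    """Median of a weighted Kaplan-Meier curve.

    Returns the earliest time at which estimated survival is <= 0.5,
    or None if the curve never falls that far (median not reached).
    events[i] is 1 for an observed event and 0 for censoring.
    """
    data = sorted(zip(times, events, weights))
    at_risk = float(sum(weights))  # weighted size of the risk set
    surv = 1.0
    i, n = 0, len(data)
    while i < n:
        t = data[i][0]
        d = 0.0        # weighted number of events at time t
        leaving = 0.0  # weighted number leaving the risk set at time t
        while i < n and data[i][0] == t:
            _, e, w = data[i]
            if e == 1:
                d += w
            leaving += w
            i += 1
        if d > 0 and at_risk > 0:
            surv *= 1.0 - d / at_risk
        if surv <= 0.5:
            return t
        at_risk -= leaving
    return None


def median_benefit(exp_median: Optional[float],
                   ctl_median: Optional[float]) -> Optional[float]:
    """Absolute difference in medians; None if either median was not reached."""
    if exp_median is None or ctl_median is None:
        return None
    return exp_median - ctl_median
```

In use, one would compute weights within a risk group, call `weighted_km_median` separately for the experimental and control arms, and pass the two medians to `median_benefit`. When an arm's median is not reached, the difference is reported as undefined rather than estimated.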
