Abstract

A deep learning approach utilizing the electronic health record (EHR) to identify cancer recurrence in renal cell carcinoma (RCC)

Full text
Background

While longitudinal EHR data present a valuable resource for developing prognostic models for renal cell carcinoma (RCC), they lack well-coded clinical outcomes, including recurrence. This study harnesses these data to design a deep learning algorithm capable of accurately identifying recurrence information in individuals with localized RCC.

Methods

We curated codified and narrative EHR data, including cancer characteristics extracted via natural language processing, for a cohort of 28,715 patients with RCC diagnostic codes in Mass General Brigham. Gold-standard labels on RCC diagnosis, recurrence timing and status were annotated for 350 patients via manual chart review. A semi-supervised label-efficient incident phenotyping deep learning algorithm (LATTE) was used to identify recurrence status and timing for eligible post-nephrectomy patients with non-metastatic RCC at diagnosis. RCC recurrence in EHR data identified via LATTE was validated against gold-standard labels and subsequently used to estimate recurrence rates stratified by AJCC TNM staging and Fuhrman grade.

Results

The eligible cohort included 5,992 patients. RCC recurrence in EHR data identified via LATTE achieved a C-statistic of 0.928 in validation against annotated recurrence. The estimated overall 5-year recurrence rate was 16.7% (table). We observed a substantially higher recurrence risk for T3 patients (49.7%) than for T1 (3.4%) or T2 (14.3%) patients.

Table: 1713P

Stratum      5-yr recurrence
Overall      16.7 (15.6%-17.8%)
T1           3.4 (2.7%-4.2%)
T2           14.3 (11.5%-18.4%)
T3           49.7 (45.8%-53.7%)
T3G1/G2      25.2 (17.7%-32.5%)
T3G3         54.2 (47.1%-61.5%)
T3G4         72.3 (64.4%-80.8%)
T4           NR

Conclusions

The study was successful in generating a deep learning algorithm to identify RCC recurrence in large EHR data, substantially reducing the need for time- and resource-intensive chart reviews.
As part of ongoing research efforts, the RCC recurrence in EHR data identified via LATTE will be further used to develop an accurate recurrence risk prediction model to identify localized RCC patients at higher risk of recurrence.

Legal entity responsible for the study

Merck & Co., Inc.

Funding

Merck Sharp & Dohme LLC, a subsidiary of Merck & Co., Inc.

Disclosure

R. Bhattacharya: Financial Interests, Personal and Institutional, Stocks/Shares: Merck & Co., Inc.; Financial Interests, Personal, Full or part-time Employment: Merck & Co., Inc.
A. Elfiky: Financial Interests, Personal, Advisory Board: McKinsey and Co.; Financial Interests, Personal, Full or part-time Employment: Merck & Co., Inc.
V. Turzhitsky: Financial Interests, Personal, Stocks/Shares: Merck & Co., Inc.; Financial Interests, Personal, Full or part-time Employment: Merck & Co., Inc.
R.R. McKay: Financial Interests, Personal, Advisory Board: Aveo, AstraZeneca, Bayer, Calithera, Dendreon, Exelixis, JNJ, Merck, Pfizer, Sanofi, Tempus, BMS, Lilly, Sanofi, Seagen, Telix, Sorrento Therapeutics, Eisai, Blue Earth Diagnostics; Financial Interests, Personal, Other, Molecular tumor board: Caris; Financial Interests, Personal, Other, Consultant: Dendreon, Myovant; Financial Interests, Personal, Invited Speaker: Exelixis, Merck, Novartis, Pfizer; Financial Interests, Steering Committee Member: Aveo, Merck, Eli Lilly; Financial Interests, Local PI: JNJ, Seagen, Calibr, Poseida, Oncternal, Scholar Rock, Merck, Xencor, Genentech, Eli Lilly, BMS; Financial Interests, Trial Chair: Bayer, AstraZeneca; Financial Interests, Institutional, Local PI: Exelixis.
G. Jayram: Financial Interests, Personal, Financially compensated role, Consultant: Merck.
M. Sundaram: Financial Interests, Personal, Full or part-time Employment: Merck & Co., Inc.; Financial Interests, Personal, Stocks or ownership: Merck & Co., Inc., AbbVie, Johnson & Johnson; Financial Interests, Personal, Stocks/Shares: Bristol Myers Squibb.
All other authors have declared no conflicts of interest.
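
Illustrative note on the validation metric

The Results report a C-statistic for LATTE-identified recurrence against chart-review labels. The abstract does not state which form of the C-statistic was used or how it was computed, so the following is only a minimal sketch under an assumption: it uses the binary form (equivalent to the area under the ROC curve), in which the statistic is the fraction of (recurrence, no-recurrence) patient pairs where the recurrent patient received the higher score, with ties counted as one half. The function name `c_statistic` and the toy labels and scores are hypothetical and are not study data.

```python
from itertools import product


def c_statistic(labels, scores):
    """Binary concordance statistic (assumed form, equal to ROC AUC).

    labels: iterable of 0/1 gold-standard outcomes (1 = recurrence).
    scores: iterable of model scores, higher meaning more likely recurrence.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    # Count concordant pairs; a tied pair contributes one half.
    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0
        elif p == n:
            concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_c = c_statistic(toy_labels, toy_scores)
```

In the toy data, 8 of the 9 positive-negative pairs are ordered correctly, so `toy_c` is 8/9. A time-to-event variant such as Harrell's C would additionally need follow-up times and censoring indicators, which this sketch does not model.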