Abstract

A multi-institutional observer study of AI-aided oncologists' performance on treatment response assessment of bladder cancer.

Author
Monika Joshi, Penn State Cancer Institute, Hershey, PA
Monika Joshi, Di Sun, Ajjai Shivaram Alva, Philip Palmbos, Yousef Zakharia, Rohan Garje, Heang-Ping Chan, Richard H. Cohan, Lubomir Hadjiiski
Full text
Organizations
Penn State Cancer Institute, Hershey, PA; University of Michigan, Ann Arbor, MI; Department of Internal Medicine, Division of Hematology and Oncology, University of Michigan, Ann Arbor, MI; University of Iowa Holden Comprehensive Cancer Center, Iowa City, IA; Department of Internal Medicine, Division of Hematology/Oncology, University of Iowa, Iowa City, IA; University of Michigan Medical School, Ann Arbor, MI

Research Funding
No funding sources reported

Background: Neoadjuvant chemotherapy (NAC) before radical cystectomy (RC) for muscle-invasive bladder cancer (MIBC) results in a complete response rate of 30-40%. The remaining patients, who do not benefit from this approach, are at risk of disease progression prior to RC. Prompt, accurate treatment response assessment (TRA) of NAC prior to RC is crucial to guide treatment and minimize toxicity. Our study evaluated the impact of an artificial intelligence (AI)-based decision support system (CDSS-T) on oncologists’ performance in identifying patients who respond completely (stage T0) to NAC.

Methods: We conducted a multicenter, retrospective observational study. The pathological cancer stage after NAC followed by RC served as the reference standard. CT urograms (CTU) of 123 patients were collected, yielding a total of 157 evaluable imaging pairs, each comprising a pre-NAC and a post-NAC CTU (T0: n=40). Five oncologists from 3 institutions participated. We randomly selected 51 CTU pairs for each oncologist, and each oncologist read their set of 51 cases twice, with a washout period of 3 weeks between readings to eliminate reading memory. Each reading session consisted of sequential reading, first without and then with the aid of CDSS-T. The CDSS-T was developed by combining radiomics and deep-learning AI predictions.
Inter-observer performance was analyzed by the ROC iMRMC method, and intra-observer performance was measured by Krippendorff’s alpha (agreement: α) and Bland–Altman (variability: standard deviation (SD)) methods.

Results: With CDSS-T aid, the 5 oncologists had a higher mean area under the curve (AUC) and a smaller SD for both the first readings (0.77±0.08 without, 0.86±0.07 with CDSS-T) and the second readings (0.77±0.09 without, 0.84±0.04 with CDSS-T). With aid, they also had higher intra-observer agreement (α: 0.70±0.05 without, 0.84±0.04 with CDSS-T) and smaller intra-observer variability (SD: 26.28±2.69 without, 19.75±2.86 with CDSS-T).

Conclusions: CDSS-T improves oncologists’ accuracy in TRA, increases intra-observer agreement, and reduces intra-observer variability. A prospective study is ongoing in MIBC to validate the findings.

             Reading 1 (AUC)        Reading 2 (AUC)        Intra-observer agreement α   Intra-observer variability SD
Oncologist   w/o CDSS-T  w/ CDSS-T  w/o CDSS-T  w/ CDSS-T  w/o CDSS-T  w/ CDSS-T        w/o CDSS-T  w/ CDSS-T
#1           0.65        0.75       0.73        0.78       0.69        0.83             30.46       21.57
#2           0.81        0.85       0.82        0.87       0.64        0.80             26.08       19.07
#3           0.81        0.89       0.77        0.83       0.73        0.85             26.05       18.67
#4           0.73        0.88       0.64        0.83       0.69        0.89             22.92       16.01
#5           0.86        0.93       0.87        0.87       0.77        0.80             25.91       23.45
Mean±SD      0.77±0.08   0.86±0.07  0.77±0.09   0.84±0.04  0.70±0.05   0.84±0.04        26.28±2.69  19.75±2.86
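
As a rough check on how the Mean±SD summary relates to the per-oncologist values reported above, the following minimal Python sketch recomputes each summary from the five reader values. It assumes the SD is the sample standard deviation (n-1 denominator), which the abstract does not state. The dictionary labels and the helper name `mean_sd` are illustrative only and are not part of the study.

```python
from statistics import mean, stdev

# Per-oncologist values (readers #1 to #5) transcribed from the abstract's table.
table = {
    "Reading 1 AUC, w/o CDSS-T": [0.65, 0.81, 0.81, 0.73, 0.86],
    "Reading 1 AUC, w/ CDSS-T": [0.75, 0.85, 0.89, 0.88, 0.93],
    "Reading 2 AUC, w/o CDSS-T": [0.73, 0.82, 0.77, 0.64, 0.87],
    "Reading 2 AUC, w/ CDSS-T": [0.78, 0.87, 0.83, 0.83, 0.87],
    "Agreement alpha, w/o CDSS-T": [0.69, 0.64, 0.73, 0.69, 0.77],
    "Agreement alpha, w/ CDSS-T": [0.83, 0.80, 0.85, 0.89, 0.80],
    "Variability SD, w/o CDSS-T": [30.46, 26.08, 26.05, 22.92, 25.91],
    "Variability SD, w/ CDSS-T": [21.57, 19.07, 18.67, 16.01, 23.45],
}


def mean_sd(values):
    """Return (mean, sample SD). The n-1 denominator is an assumption."""
    return mean(values), stdev(values)


# Summary statistics across the five readers for every table column.
summary = {label: mean_sd(values) for label, values in table.items()}

for label, (m, s) in summary.items():
    print(f"{label}: {m:.2f}±{s:.2f}")
```

Because the inputs are already rounded to two decimals, a recomputed summary can differ from the published one in the last digit. For example, the mean agreement α with CDSS-T comes out near 0.83 rather than the reported 0.84, presumably because the published summary was computed from unrounded values.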
