Abstract

Promptable foundation model for automatic whole body RECIST measurement.

Author
Paul Hérent, Hélène Philippe, Elodie Ferreres, Leo Machado, Corentin Dancette, Julien Khlaut, Pierre Manceron, Daniel Tordjman
Raidium, Paris, France
Full text
Research Funding: No funding sources reported

Background: RECIST 1.1 is the gold standard for tumor evaluation in clinical trials and patient care. However, it is limited by the subjective selection and measurement of lesions: inter-observer variability is in the 20-30% range (1), which can lead to inaccurate classification of therapeutic response (2). The main objective of this study is to assess the added value of a radiological foundation model that assists radiologists in measuring RECIST across different lesion topographies, using visual prompting.

Methods: We use a promptable segmentation algorithm based on a visual foundation model, pre-trained on various CT scans and then trained on a subset of the Medical Segmentation Decathlon (MSD) dataset covering 538 patients with pancreas, liver, colon and lung lesions. Given a visual prompt, i.e. a 3D bounding box (BBox), as input, it outputs a 3D segmentation mask of the lesion. We then evaluate the model's ability to measure RECIST on the MSD validation set and on the KiTS validation set (99 samples), which features a new lesion type (kidney). To simulate inter-reader variability of up to 23%, we provide as inputs BBoxes centered on the lesions but with a varying error (15% variability: +10 pixels; 23% variability: +15 pixels).

Results: The model is able to correct variability in the BBox input: at both BBox error levels, its median measurement error stays below the input variability (the 15% and 23% thresholds of each column) for every lesion type evaluated, except the pancreas with large-error BBoxes.
Moreover, the model generalizes to a lesion type not seen during training, on an external dataset (KiTS, kidney).

Conclusions: Unlike existing supervised machine learning models dedicated to lesion detection in specific organs, our approach is organ- and lesion-agnostic and offers a more reliable, precise tumor assessment. This is a first step toward the longitudinal evaluation of tumors while safeguarding the clinical intuition necessary for selecting the right target lesions. Moreover, we believe that these methods will be crucial in advancing beyond RECIST 1.1, facilitating the identification of new prognostic biomarkers and proxies for tumor burden derived from previously underexplored radiomics features, ultimately refining the efficacy of clinical trial assessments and elevating the standard of patient care.

References
1. Yoon, S et al. 2015.
2. Kuhl et al. 2018.

Median RECIST measurement error, % (mm), by variability in the input BBox

Dataset           15% variability in BBox   23% variability in BBox
MSD - Lung        10.9% (3.4)               8.9% (2.7)
MSD - Liver       9.4% (1.6)                11.4% (2.4)
MSD - Colon       2.78% (1.2)               4.15% (2.4)
MSD - Pancreas    13.1% (3.9)               28.1% (8.9)
KiTS - Kidney     9.9% (3.5)                9.2% (3.1)
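
The evaluation described in Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the nested-list mask layout (mask[z][y][x]), the reading of "+10 pixels" as a margin added on every side of the box, and the use of the longest in-plane axial diameter as the RECIST measurement are all assumptions made for illustration.

```python
from itertools import combinations
from math import hypot
from statistics import median


def expand_bbox(bbox, margin, shape):
    """Grow an inclusive 3D box (z0, y0, x0, z1, y1, x1) by `margin` voxels
    on every side, clipped to a volume of size `shape` = (Z, Y, X).
    Assumption: this is one possible way to add a fixed pixel error to a prompt."""
    z0, y0, x0, z1, y1, x1 = bbox
    lo = [max(0, v - margin) for v in (z0, y0, x0)]
    hi = [min(s - 1, v + margin) for v, s in zip((z1, y1, x1), shape)]
    return (*lo, *hi)


def longest_axial_diameter_mm(mask, spacing_yx=(1.0, 1.0)):
    """Longest in-plane distance between two foreground voxel centers,
    maximized over axial slices. `mask` is a nested list indexed [z][y][x].
    Brute force over voxel pairs: fine for a toy mask, slow for real volumes."""
    sy, sx = spacing_yx
    best = 0.0
    for axial_slice in mask:
        points = [
            (y * sy, x * sx)
            for y, row in enumerate(axial_slice)
            for x, value in enumerate(row)
            if value
        ]
        for (ya, xa), (yb, xb) in combinations(points, 2):
            best = max(best, hypot(yb - ya, xb - xa))
    return best


def median_percent_error(predicted_mm, reference_mm):
    """Median of |predicted - reference| / reference, in percent."""
    return median(
        abs(p - r) / r * 100.0 for p, r in zip(predicted_mm, reference_mm)
    )


# Toy usage: one axial slice with a small lesion, 0.8 mm in-plane spacing.
toy_mask = [[
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
]]
diameter = longest_axial_diameter_mm(toy_mask, spacing_yx=(0.8, 0.8))
noisy_prompt = expand_bbox((0, 1, 1, 0, 3, 3), margin=10, shape=(1, 4, 5))
```

In this sketch a model's predicted mask and the reference mask would each be passed through longest_axial_diameter_mm, and the per-lesion values compared with median_percent_error, which mirrors the "% (mm)" layout of the table above. Measuring between voxel centers slightly underestimates the true extent of the lesion.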
