Improving Uncertainty-Error Correspondence in Deep Bayesian Medical Image Segmentation

Prerak Mody1,2,3, Nicolas F. Chaves-de-Plaza4, Chinmay Rao1, Eleftheria Astrenidou5, Mischa de Ridder6, Nienke Hoekstra5, Klaus Hildebrandt4, Marius Staring1,5
1: Department of Radiology, Leiden University Medical Center, Leiden, The Netherlands, 2: HollandPTC consortium – Erasmus Medical Center, Rotterdam, Holland Proton Therapy Centre, Delft, Leiden, 3: University Medical Center, Leiden and TU Delft, Delft, The Netherlands, 4: Computer Graphics and Visualization Group, EEMCS, TU Delft, Delft, The Netherlands, 5: Department of Radiation Oncology, Leiden University Medical Center, Leiden, The Netherlands, 6: Department of Radiation Oncology, University Medical Center, Utrecht, The Netherlands
Publication date: 2024/08/31
https://doi.org/10.59275/j.melba.2024-5gc8

Abstract

The increased use of automated tools such as deep learning in medical image segmentation has alleviated the bottleneck of manual contouring. This has shifted manual labour to quality assessment (QA) of automated contours, which involves detecting errors and correcting them. A potential solution for semi-automated QA is to use deep Bayesian uncertainty to recommend potentially erroneous regions, thus reducing the time spent on error detection. Previous work has investigated the correspondence between uncertainty and error; however, no work has been done on improving the “utility” of Bayesian uncertainty maps such that uncertainty is present only in inaccurate regions and not in accurate ones. Our work trains the FlipOut model with the Accuracy-vs-Uncertainty (AvU) loss, which promotes uncertainty to be present only in inaccurate regions. We apply this method to datasets from two radiotherapy body sites, namely head-and-neck CT and prostate MR scans. Uncertainty heatmaps (i.e., predictive entropy) are evaluated against voxel inaccuracies using Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves. Numerical results show that, compared to the Bayesian baseline, the proposed method successfully suppresses uncertainty for accurate voxels, with a similar presence of uncertainty for inaccurate voxels. Code to reproduce the experiments is available at https://github.com/prerakmody/bayesuncertainty-error-correspondence
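
To make the evaluation idea concrete, the following is a minimal sketch (not the authors' implementation) of how predictive entropy could be computed from repeated stochastic forward passes and then scored against voxel errors with a ROC area under the curve. The function names `predictive_entropy` and `roc_auc`, the array shapes, and the toy probabilities are all assumptions chosen for illustration; the linked repository should be consulted for the actual pipeline.

```python
import numpy as np


def predictive_entropy(mc_probs, eps=1e-12):
    """Entropy of the mean softmax over T stochastic passes.

    mc_probs: array of shape (T, N, C) holding class probabilities for
    N voxels and C classes (shape convention is an assumption).
    """
    mean_probs = mc_probs.mean(axis=0)  # (N, C)
    return -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)  # (N,)


def roc_auc(uncertainty, is_error):
    """ROC-AUC of uncertainty as a detector of erroneous voxels.

    Uses the pairwise (Mann-Whitney) formulation, which needs O(P * N)
    memory and is therefore only suitable for small toy inputs.
    """
    pos = uncertainty[is_error]
    neg = uncertainty[~is_error]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


# Toy data (hypothetical): T=2 passes, N=4 voxels, C=2 classes.
mc_probs = np.array([
    [[0.99, 0.01], [0.55, 0.45], [0.02, 0.98], [0.9, 0.1]],
    [[0.99, 0.01], [0.55, 0.45], [0.02, 0.98], [0.3, 0.7]],
])
labels = np.array([0, 1, 1, 1])

entropy = predictive_entropy(mc_probs)
prediction = mc_probs.mean(axis=0).argmax(axis=-1)
is_error = prediction != labels

auc = roc_auc(entropy, is_error)
print(entropy.round(3), is_error, auc)
```

In this toy case the two misclassified voxels also carry the highest entropy, so uncertainty separates errors from correct voxels perfectly; on real scans the separation is partial, which is what the ROC and PR curves quantify.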

Keywords

Bayesian Deep Learning · Bayesian Uncertainty · Uncertainty-Error Correspondence · Uncertainty Calibration · Contour Quality Assessment · Model Calibration

Bibtex
@article{melba:2024:018:mody,
  title = "Improving Uncertainty-Error Correspondence in Deep Bayesian Medical Image Segmentation",
  author = "Mody, Prerak and Chaves-de-Plaza, Nicolas F. and Rao, Chinmay and Astrenidou, Eleftheria and de Ridder, Mischa and Hoekstra, Nienke and Hildebrandt, Klaus and Staring, Marius",
  journal = "Machine Learning for Biomedical Imaging",
  volume = "2",
  issue = "August 2024 issue",
  year = "2024",
  pages = "1048--1082",
  issn = "2766-905X",
  doi = "https://doi.org/10.59275/j.melba.2024-5gc8",
  url = "https://melba-journal.org/2024:018"
}
RIS
TY - JOUR
AU - Mody, Prerak
AU - Chaves-de-Plaza, Nicolas F.
AU - Rao, Chinmay
AU - Astrenidou, Eleftheria
AU - de Ridder, Mischa
AU - Hoekstra, Nienke
AU - Hildebrandt, Klaus
AU - Staring, Marius
PY - 2024
TI - Improving Uncertainty-Error Correspondence in Deep Bayesian Medical Image Segmentation
T2 - Machine Learning for Biomedical Imaging
VL - 2
IS - August 2024 issue
SP - 1048
EP - 1082
SN - 2766-905X
DO - https://doi.org/10.59275/j.melba.2024-5gc8
UR - https://melba-journal.org/2024:018
ER -
