Counterfactual Explanations for Medical Image Classification and Regression using Diffusion Autoencoder

Matan Atad1,2, David Schinz1, Hendrik Moeller1,2, Robert Graf1,2, Benedikt Wiestler1,3, Daniel Rueckert2, Nassir Navab4, Jan S. Kirschke1, Matthias Keicher4
1: Department of Diagnostic and Interventional Neuroradiology, Klinikum rechts der Isar, Technical University of Munich, Germany, 2: Institute for Artificial Intelligence and Computer Science in Medicine, Technical University of Munich, Germany, 3: AI for Image-Guided Diagnosis and Therapy, Technical University of Munich, Germany, 4: Computer Aided Medical Procedures, Technical University of Munich, Germany
Publication date: 2024/09/30
https://doi.org/10.59275/j.melba.2024-4862

Abstract

Counterfactual explanations (CEs) aim to enhance the interpretability of machine learning models by illustrating how alterations in input features would affect the resulting predictions. Common CE approaches require an additional model and are typically constrained to binary counterfactuals. In contrast, we propose a novel method that operates directly on the latent space of a generative model, specifically a Diffusion Autoencoder (DAE). This approach offers inherent interpretability by enabling the generation of CEs and the continuous visualization of the model’s internal representation across decision boundaries. Our method leverages the DAE’s ability to encode images into a semantically rich latent space in an unsupervised manner, eliminating the need for labeled data or separate feature extraction models. We show that these latent representations are helpful for medical condition classification and for the ordinal regression of pathology severity, for conditions such as vertebral compression fractures (VCF) and diabetic retinopathy (DR). Beyond binary CEs, our method supports the visualization of ordinal CEs using a linear model, providing deeper insights into the model’s decision-making process and enhancing interpretability. Experiments across various medical imaging datasets demonstrate the method’s advantages in interpretability and versatility. The linear manifold of the DAE’s latent space allows for meaningful interpolation and manipulation, making it a powerful tool for exploring medical image properties. Our code is available at https://github.com/matanat/dae_counterfactual
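
The latent-space idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that); it only assumes that a latent vector z has already been obtained from some encoder and that a linear model with weights w and bias b scores it. The function names linear_score, shift_to_score, and counterfactual_path are hypothetical, and decoding the shifted latents back into images is omitted. Under these assumptions, moving z along w is the smallest Euclidean change that reaches a chosen target score, and intermediate points on that line give a continuous path across the decision boundary (score 0) or between ordinal levels.

```python
from typing import List, Sequence


def linear_score(z: Sequence[float], w: Sequence[float], b: float) -> float:
    """Score of a linear model on latent z: w . z + b."""
    return sum(zi * wi for zi, wi in zip(z, w)) + b


def shift_to_score(
    z: Sequence[float], w: Sequence[float], b: float, target: float
) -> List[float]:
    """Move z along w (the smallest L2 change) so the score equals target."""
    norm_sq = sum(wi * wi for wi in w)
    if norm_sq == 0.0:
        raise ValueError("weight vector must be non-zero")
    step = (target - linear_score(z, w, b)) / norm_sq
    return [zi + step * wi for zi, wi in zip(z, w)]


def counterfactual_path(
    z: Sequence[float],
    w: Sequence[float],
    b: float,
    target: float,
    steps: int = 5,
) -> List[List[float]]:
    """Evenly spaced latents from z to its counterfactual at the target score."""
    z_end = shift_to_score(z, w, b, target)
    return [
        [zi + (k / steps) * (ze - zi) for zi, ze in zip(z, z_end)]
        for k in range(steps + 1)
    ]


if __name__ == "__main__":
    # Illustrative values only; a real latent would come from an encoder.
    z = [1.0, 2.0]
    w = [3.0, 4.0]
    b = -1.0
    for point in counterfactual_path(z, w, b, target=-5.0):
        print(point, round(linear_score(point, w, b), 6))
```

Because the score is linear in z, it changes by the same amount at every step of the path, so each intermediate latent corresponds to a predictable score. In a setting like the one the abstract describes, each latent on the path would then be passed to the generative decoder to visualize the change.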

Keywords

Counterfactual Explanations · Interpretability · Diffusion Model · Latent Space · Medical Imaging

Bibtex
@article{melba:2024:024:atad,
  title = "Counterfactual Explanations for Medical Image Classification and Regression using Diffusion Autoencoder",
  author = "Atad, Matan and Schinz, David and Moeller, Hendrik and Graf, Robert and Wiestler, Benedikt and Rueckert, Daniel and Navab, Nassir and Kirschke, Jan S. and Keicher, Matthias",
  journal = "Machine Learning for Biomedical Imaging",
  volume = "2",
  issue = "iMIMIC 2023 special issue",
  year = "2024",
  pages = "2103--2125",
  issn = "2766-905X",
  doi = "https://doi.org/10.59275/j.melba.2024-4862",
  url = "https://melba-journal.org/2024:024"
}

RIS
TY  - JOUR
AU  - Atad, Matan
AU  - Schinz, David
AU  - Moeller, Hendrik
AU  - Graf, Robert
AU  - Wiestler, Benedikt
AU  - Rueckert, Daniel
AU  - Navab, Nassir
AU  - Kirschke, Jan S.
AU  - Keicher, Matthias
PY  - 2024
TI  - Counterfactual Explanations for Medical Image Classification and Regression using Diffusion Autoencoder
T2  - Machine Learning for Biomedical Imaging
VL  - 2
IS  - iMIMIC 2023 special issue
SP  - 2103
EP  - 2125
SN  - 2766-905X
DO  - https://doi.org/10.59275/j.melba.2024-4862
UR  - https://melba-journal.org/2024:024
ER  -
