Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge

Dominic LaBella1Orcid, Ujjwal Baid2,3, Omaditya Khanna4, Shan McBurney-Lin5, Ryan McLean6, Pierre Nedelec5, Arif Rashid7, Nourel Hoda Tahon8, Talissa Altes4, Radhika Bhalerao5, Yaseen Dhemesh8, Devon Godfrey1, Fathi Hilal8, Scott Floyd1, Anastasia Janas6, Anahita Fathi Kazerooni9,10, John Kirkpatrick1, Collin Kent1, Florian Kofler11,12,13,14, Kevin Leu15, Nazanin Maleki6, Bjoern Menze16,17, Maxence Pajot5, Zachary J. Reitman1, Jeffrey D. Rudie18,5, Rachit Saluja19, Yury Velichko20, Chunhao Wang1, Pranav Warman21, Maruf Adewole22, Jake Albrecht23, Udunna Anazodo24, Syed Muhammad Anwar25,26, Timothy Bergquist23, Sully Francis Chen21, Verena Chung23, Rong Chai23, Gian-Marco Conte27, Farouk Dako28, James Eddy23, Ivan Ezhov12,13, Nastaran Khalili9, Juan Eugenio Iglesias29,30,31, Zhifan Jiang25,26, Elaine Johanson32, Koen Van Leemput33, Hongwei Bran Li34,17,35, Marius George Linguraru25,26, Xinyang Liu25,26, Aria Mahtabfar4, Zeke Meier36, Ahmed W Moawad37, John Mongan5, Marie Piraud11, Russell Takeshi Shinohara38,10, Walter F. Wiggins39,40,41, Aly H. Abayazeed42, Rachel Akinola43, András Jakab44, Michel Bilello10,45, Maria Correia de Verdier46, Priscila Crivellaro47, Christos Davatzikos10,45, Keyvan Farahani48, John Freymann49,48, Christopher Hess5, Raymond Huang50, Philipp Lohmann51,52, Mana Moassefi53, Matthew W. Pease54, Phillipp Vollmuth55,56, Nico Sollmann57,58,59, David Diffley60, Khanak K. Nandolia61, Daniel I Warren62, Ali Hussain63, Pascal Fehringer64, Yulia Bronstein65, Lisa Deptula66, Evan G. Stein67, Mahsa Taherzadeh68, Eduardo Portela de Oliveira69, Aoife Haughey70, Marinos Kontzialis71, Luca Saba72, Benjamin Turner73, Melanie M. T. Brüßeler74, Shehbaz Ansari75, Athanasios Gkampenis76, David Maximilian Weiss77, Aya Mansour78, Islam H. Shawali79, Nikolay Yordanov80, Joel M. Stein45, Roula Hourani81, Mohammed Yahya Moshebah82, Ahmed Magdy Abouelatta83, Tanvir Rizvi84, Klara Willms6, Dann C. 
Martin85, Abdullah Okar86, Gennaro D’Anna87, Ahmed Taha88, Yasaman Sharifi89, Shahriar Faghani27, Dominic Kite90, Marco Pinho91, Muhammad Ammar Haider92, Alejandro Aristizabal93,94, Alexandros Karargyris93, Hasan Kassem93, Sarthak Pati95,2,96, Micah Sheller97,93, Michelle Alonso-Basanta7, Javier Villanueva-Meyer5, Andreas M Rauschecker5, Ayman Nada8, Mariam Aboian9, Adam E. Flanders98, Benedikt Wiestler17, Spyridon Bakas2,99,100,101Orcid, Evan Calabrese41,5
1: Duke University Medical Center, Department of Radiation Oncology, Durham, NC, USA, 2: Division of Computational Pathology, Department of Pathology and Laboratory Medicine, Indiana University School of Medicine, Indianapolis, IN, USA, 3: Center for Federated Learning in Medicine, Indiana University, Indianapolis, IN, USA, 4: Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA, USA, 5: University of California San Francisco, CA, USA, 6: Yale University, New Haven, CT, USA, 7: Department of Radiation Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA, 8: University of Missouri, Columbia, MO, USA, 9: Children’s Hospital of Philadelphia, University of Pennsylvania, Philadelphia, PA, USA, 10: Center for AI and Data Science for Integrated Diagnostics (AI2D) and Center for Biomedical Image Computing and Analytics (CBICA), University of Pennsylvania, Philadelphia, PA, USA, 11: Helmholtz AI, Helmholtz Munich, Germany, 12: Department of Informatics, Technical University Munich, Germany, 13: TranslaTUM - Central Institute for Translational Cancer Research, Technical University of Munich, Germany, 14: Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Klinikum rechts der Isar, Technical University of Munich, Germany, 15: Center for Intelligent Imaging (ci2), Department of Radiology and Biomedical Imaging, University of California San Francisco (UCSF), San Francisco, CA, USA, 16: Biomedical Image Analysis and Machine Learning, Department of Quantitative Biomedicine, University of Zurich, Switzerland, 17: Department of Neuroradiology, Technical University of Munich, Munich, Germany, 18: University of San Diego, CA, USA, 19: Cornell University, Ithaca, NY, USA, 20: Department of Radiology, Northwestern University, Evanston, IL, USA, 21: Duke University Medical Center, School of Medicine, Durham, NC, USA, 22: Medical Artificial Intelligence (MAI) Lab, Crestview Radiology, Lagos, Nigeria, 23: 
Sage Bionetworks, USA, 24: Montreal Neurological Institute (MNI), McGill University, Montreal, QC, Canada, 25: Children’s National Hospital, Washington DC, USA, 26: George Washington University, Washington DC, USA, 27: Mayo Clinic, Rochester, MN, USA, 28: Center for Global Health, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA, 29: Athinoula A Martinos Center for Biomedical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA, 30: Centre for Medical Image Computing, University College London, London, UK, 31: Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA, 32: PrecisionFDA, U.S. Food and Drug Administration, Silver Spring, MD, USA, 33: Department of Applied Mathematics and Computer Science, Technical University of Denmark, Denmark, 34: Athinoula A Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Boston, MA, USA, 35: University of Zurich, Switzerland, 36: Booz Allen Hamilton, McLean, VA, USA, 37: Mercy Catholic Medical Center, Darby, PA, USA, 38: Penn Statistics in Imaging and Visualization Center, Department of Biostatistics, Epidemiology, and Informatics, 39: Greensboro Radiology, Greensboro, NC, USA, 40: Radiology Partners, El Segundo, CA, USA, 41: Duke University Medical Center, Department of Radiology, Durham, NC, USA, 42: Neosoma Inc. 
Stanford Medicine, Stanford, CA, USA, 43: Lagos University Teaching Hospital, Lagos, Nigeria, 44: University of Zürich, Zürich, Switzerland, 45: Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA, 46: Department of Neuroradiology, Uppsala University, Sweden, 47: University of Toronto, Toronto, ON, Canada, 48: Cancer Imaging Program, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA, 49: Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, MD, USA, 50: Department of Radiology, Brigham and Women’s Hospital, Boston, MA, USA, 51: Institute of Neuroscience and Medicine (INM-4), Research Center Juelich, Juelich, Germany, 52: Department of Nuclear Medicine, University Hospital RWTH Aachen, Aachen, Germany, 53: Artificial Intelligence Lab, Department of Radiology, Mayo Clinic, Rochester, MN, USA, 54: Department of Neurosurgery, Indiana University, Indianapolis, IN, USA, 55: Department of Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany, 56: Department of Neuroradiology, University Hospital Bonn, Bonn, Germany, 57: Department of Diagnostic and Interventional Radiology, University Hospital Ulm, Ulm, Germany, 58: Department of Diagnostic and Interventional Neuroradiology, School of Medicine, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany, 59: TUM-Neuroimaging Center, Klinikum rechts der Isar, Technical University of Munich, Munich, Germany, 60: Fort Worth, TX, USA, 61: Department of Diagnostic and Interventional Radiology, All India Institute of Medical Sciences, Rishikesh, India, 62: Department of Neuroradiology, Washington University, St.
Louis, MO, USA, 63: University of Rochester Medical Center, Rochester, NY, USA, 64: Faculty of Medicine, Jena University Hospital, Friedrich Schiller University Jena, Jena, Germany, 65: vRad (Radiology Partners), Minneapolis, MN, USA, 66: Ross University School of Medicine, Bridgetown, Barbados, 67: Department of Radiology, New York University Grossman School of Medicine, New York, NY, USA, 68: Department of Radiology, Arad Hospital, Tehran, Iran, 69: Department of Radiology, Faculty of Medicine, University of Ottawa, Canada, 70: Department of Neuroradiology, JDMI, University of Toronto, TO, Canada, 71: Department of Radiology, Northwestern University, Chicago, IL, 72: Department of Radiology, Azienda Ospedaliero Universitaria of Cagliari-Polo di Monserrato, Cagliari, Italy, 73: Department of Radiology, Leeds General Infirmary, Leeds, United Kingdom, 74: Ludwig Maximilians University, Munich, Bavaria, Germany, 75: Rush University Medical Center, Chicago, IL, USA, 76: Department of Neurosurgery, University Hospital of Ioannina, Ioannina, Greece, 77: Department of Neuroradiology, University Hospital Essen, Essen, North Rhine-Westphalia, Germany, 78: Egyptian Ministry of Health, Cairo, Egypt, 79: Department of Radiology, Kasr Alainy, Cairo University, Cairo, Egypt, 80: Faculty of Medicine, Medical University of Sofia, Sofia, Bulgaria, 81: Department of Radiology, American University of Beirut Medical Center, Beirut, Lebanon, 82: Radiology Department, King Faisal Medical City, Abha, Saudi Arabia, 83: Department of Diagnostic and Interventional Radiology, Cairo University, Cairo, Egypt, 84: Department of Radiology and Medical Imaging, University of Virginia Health, Charlottesville, VA, USA, 85: Department of Radiology and Radiologic Sciences, Vanderbilt University Medical Center, TN, USA, 86: Faculty of Medicine, Hamburg University, Hamburg, Germany, 87: Neuroimaging Unit, ASST Ovest Milanese, Legnano, Milan, Italy, 88: University of Manitoba, Manitoba, Canada, 89:
Department of Radiology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran, 90: Department of Radiology, University Hospitals Bristol and Weston NHS Foundation Trust, Bristol, United Kingdom, 91: Department of Radiology, University of Texas Southwestern Medical Center, TX, USA, 92: CMH Lahore Medical College, Lahore, Pakistan, 93: MLCommons, 94: Factored AI, 95: Center For Federated Learning in Medicine, Indiana University, Indianapolis, IN, USA, 96: Medical Working Group, MLCommons, San Francisco, CA, USA, 97: Intel, 98: Department of Radiology, Thomas Jefferson University, Philadelphia, PA, USA, 99: Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA, 100: Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA, 101: Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, IN, USA
Publication date: 2025/03/07
https://doi.org/10.59275/j.melba.2025-bea1

Abstract

We describe the design and results of the BraTS 2023 Intracranial Meningioma Segmentation Challenge. The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas, which are typically benign extra-axial tumors with diverse radiologic and anatomical presentations and a propensity for multiplicity. Nine participating teams each developed deep-learning automated segmentation models using image data from the largest multi-institutional, systematically expert-annotated, multilabel, multi-sequence meningioma MRI dataset to date, which included 1000 training set cases, 141 validation set cases, and 283 hidden test set cases. Each case included T2, FLAIR, T1, and T1Gd brain MRI sequences with associated tumor compartment labels delineating enhancing tumor, non-enhancing tumor, and surrounding non-enhancing FLAIR hyperintensity. Participant automated segmentation models were evaluated and ranked based on a scoring system using lesion-wise metrics, including the dice similarity coefficient (DSC) and 95% Hausdorff Distance. The top-ranked team had lesion-wise median DSCs of 0.976, 0.976, and 0.964 for enhancing tumor, tumor core, and whole tumor, respectively, with corresponding average DSCs of 0.899, 0.904, and 0.871. These results serve as state-of-the-art benchmarks for future pre-operative meningioma automated segmentation algorithms. Additionally, we found that 1286 of 1424 cases (90.3%) had at least one tumor compartment voxel abutting the edge of the skull-stripped image, a finding that warrants further investigation into optimal face-anonymization pre-processing steps.

Keywords

Meningioma · BraTS · Machine Learning · Segmentation · BraTS-Meningioma · Image Analysis Challenge · artificial intelligence · AI

Bibtex @article{melba:2025:003:labella, title = "Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge", author = "LaBella, Dominic and Baid, Ujjwal and Khanna, Omaditya and McBurney-Lin, Shan and McLean, Ryan and Nedelec, Pierre and Rashid, Arif and Tahon, Nourel Hoda and Altes, Talissa and Bhalerao, Radhika and Dhemesh, Yaseen and Godfrey, Devon and Hilal, Fathi and Floyd, Scott and Janas, Anastasia and Kazerooni, Anahita Fathi and Kirkpatrick, John and Kent, Collin and Kofler, Florian and Leu, Kevin and Maleki, Nazanin and Menze, Bjoern and Pajot, Maxence and Reitman, Zachary J. and Rudie, Jeffrey D. and Saluja, Rachit and Velichko, Yury and Wang, Chunhao and Warman, Pranav and Adewole, Maruf and Albrecht, Jake and Anazodo, Udunna and Anwar, Syed Muhammad and Bergquist, Timothy and Chen, Sully Francis and Chung, Verena and Chai, Rong and Conte, Gian-Marco and Dako, Farouk and Eddy, James and Ezhov, Ivan and Khalili, Nastaran and Iglesias, Juan Eugenio and Jiang, Zhifan and Johanson, Elaine and Van Leemput, Koen and Li, Hongwei Bran and Linguraru, Marius George and Liu, Xinyang and Mahtabfar, Aria and Meier, Zeke and Moawad, Ahmed W and Mongan, John and Piraud, Marie and Shinohara, Russell Takeshi and Wiggins, Walter F. and Abayazeed, Aly H. and Akinola, Rachel and Jakab, András and Bilello, Michel and Correia de Verdier, Maria and Crivellaro, Priscila and Davatzikos, Christos and Farahani, Keyvan and Freymann, John and Hess, Christopher and Huang, Raymond and Lohmann, Philipp and Moassefi, Mana and Pease, Matthew W. and Vollmuth, Phillipp and Sollmann, Nico and Diffley, David and Nandolia, Khanak K. and Warren, Daniel I and Hussain, Ali and Fehringer, Pascal and Bronstein, Yulia and Deptula, Lisa and Stein, Evan G. and Taherzadeh, Mahsa and Portela de Oliveira, Eduardo and Haughey, Aoife and Kontzialis, Marinos and Saba, Luca and Turner, Benjamin and Brüßeler, Melanie M. T. 
and Ansari, Shehbaz and Gkampenis, Athanasios and Weiss, David Maximilian and Mansour, Aya and Shawali, Islam H. and Yordanov, Nikolay and Stein, Joel M. and Hourani, Roula and Moshebah, Mohammed Yahya and Abouelatta, Ahmed Magdy and Rizvi, Tanvir and Willms, Klara and Martin, Dann C. and Okar, Abdullah and D’Anna, Gennaro and Taha, Ahmed and Sharifi, Yasaman and Faghani, Shahriar and Kite, Dominic and Pinho, Marco and Haider, Muhammad Ammar and Aristizabal, Alejandro and Karargyris, Alexandros and Kassem, Hasan and Pati, Sarthak and Sheller, Micah and Alonso-Basanta, Michelle and Villanueva-Meyer, Javier and Rauschecker, Andreas M and Nada, Ayman and Aboian, Mariam and Flanders, Adam E. and Wiestler, Benedikt and Bakas, Spyridon and Calabrese, Evan", journal = "Machine Learning for Biomedical Imaging", volume = "3", issue = "March 2025 issue", year = "2025", pages = "38--58", issn = "2766-905X", doi = "https://doi.org/10.59275/j.melba.2025-bea1", url = "https://melba-journal.org/2025:003" }



1 Introduction and Related Works


Meningiomas are the most common primary brain tumors, and several common treatment modalities, including surgical resection and radiation therapy, require accurate delineation of tumor components (Ogasawara et al., 2021; Rogers et al., 2017, 2020). Clinically, meningioma segmentation is typically performed on multi-sequence brain magnetic resonance imaging (MRI), including T1-weighted (T1), T2-weighted (T2), T2-weighted fluid-attenuated inversion recovery (FLAIR), and T1-weighted post-contrast (T1Gd) sequences (Martz et al., 2022). Meningioma segmentation on brain MRI can be challenging due to the diverse morphology and location of these tumors. Anatomically, meningiomas arise from the arachnoid layer of the meninges between the dura mater and pia mater and commonly present at supratentorial sites of dural reflection, along the sphenoid sinus, and at the skull base. Less commonly, meningiomas occur in intraventricular and suprasellar regions, the olfactory groove, and the posterior fossa along the petrous bone (LaBella et al., 2023). Examples of common anatomical locations of meningioma are depicted in Figure 1, adapted from Murek (Murek, 2024). Their extra-axial location can frequently lead to their exclusion, in whole or in part, by brain MRI skull-stripping pre-processing steps. Radiographically, meningiomas can have a wide range of appearances, which contributes to the difficulty of creating accurate, generalizable automated meningioma segmentation models (Watts et al., 2014). Commonly encountered radiographic variants and findings include en plaque meningioma (a plaque-like sessile extension of tumor along the meninges), cystic meningioma components, dural tail extension, peri-tumoral edema, and multiple distinct lesions (Watts et al., 2014).

Recent advancements in segmentation techniques for brain tumors, particularly gliomas, through the application of deep learning and convolutional neural networks (CNNs), have shown promise in overcoming these challenges, offering increased accuracy and reproducibility compared to traditional methods (Pereira et al., 2016; Havaei et al., 2017; Bouget et al., 2022). Since its inception in 2012, the Brain Tumor Segmentation (BraTS) challenge has been instrumental in propelling forward the field of brain tumor imaging segmentation by providing comprehensive datasets that facilitate the development and benchmarking of segmentation algorithms (Menze et al., 2014; Bakas et al., 2017). The inaugural 2012 challenge comprised 35 training cases and 15 test cases focused solely on glioma; the glioma dataset has since grown to over 2000 cases in the 2023 challenge. Studies by Menze et al. and Bakas et al. underscore the importance of the BraTS dataset in improving segmentation accuracy for gliomas, leveraging multi-sequence MR images to improve the delineation of tumor tissue from non-tumorous brain matter (Menze et al., 2014; Bakas et al., 2017; Gordillo et al., 2013; Işın et al., 2016).

In 2023, the BraTS organizing committee hosted new automated segmentation challenges additionally focused on pediatric tumors, gliomas diagnosed in sub-Saharan Africa, brain metastases, and meningioma (bra, 2023; LaBella et al., 2023; Baid et al., 2021; Kazerooni et al., 2023; Moawad et al., 2023). Building on the foundation of prior BraTS challenges, the BraTS 2023 Intracranial Meningioma Segmentation Challenge aimed to establish a community standard and benchmark for intracranial meningioma segmentation (Bakas et al., 2017; LaBella et al., 2023; Calabrese and LaBella, 2023). We present a comprehensive analysis of segmentation performance across the nine teams participating in the challenge, focusing on key metrics: enhancing tumor (ET) dice similarity coefficient (DSC), tumor core (TC) DSC, whole tumor (WT) DSC, ET 95% Hausdorff Distance (95HD), TC 95HD, and WT 95HD. These metrics were evaluated on a lesion-wise basis to account for the possibility of multiple lesions. In many clinical scenarios, particularly in diseases such as meningioma where patients may present with multiple lesions of varying sizes, global metrics average performance over the entire image volume. This averaging can mask suboptimal performance on smaller or less conspicuous lesions. By contrast, lesion-wise evaluation assesses each individual lesion separately, thereby providing a more nuanced picture of an algorithm’s performance. For instance, a segmentation algorithm might achieve a high overall DSC by accurately segmenting larger lesions while missing or poorly delineating smaller ones. Evaluating the DSC and 95HD on a lesion-by-lesion basis highlights such discrepancies, which is particularly important for clinical decision-making, where even a single missed lesion could be significant. Advantages and disadvantages of lesion-wise metrics are listed below.

Advantages of lesion-wise metrics:

  • Granular Assessment: Evaluates each lesion individually, revealing performance variability hidden in global metrics.

  • Clinical Relevance: Aligns with clinical needs by ensuring even small, critical lesions are accurately segmented.

  • Error Localization: Identifies specific algorithm weaknesses on a per-lesion basis.

Disadvantages of lesion-wise metrics:

  • Noise Sensitivity: Small errors in tiny lesions can disproportionately impact metric values.

  • Definition Ambiguity: Variability in defining individual lesions (especially confluent ones) may lead to inconsistent evaluations.

By evaluating each of the competing teams’ automated segmentation algorithms’ performance using lesion-wise metrics, we can identify state-of-the-art machine learning algorithm techniques. By doing so, we anticipate extension beyond the technical realm, to impacting patient outcomes, surgical approaches, radiation therapy planning, and understanding tumor behavior such as the propensity for an extra-axial location. As such, this study contributes to the technical field of medical imaging analysis and to the broader understanding of meningioma treatment and management strategies.
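The masking effect of global metrics described above can be made concrete with a small numeric sketch (the voxel counts are synthetic and purely illustrative, not drawn from the challenge data): a model that perfectly segments a 1000-voxel lesion but entirely misses a 20-voxel lesion still attains a near-perfect global DSC, while lesion-wise averaging drops to 0.5.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

# Synthetic case: one large lesion (1000 voxels) and one small lesion (20 voxels).
gt = np.zeros(2000, dtype=bool)
gt[:1000] = True       # large lesion, segmented perfectly by the model
gt[1500:1520] = True   # small lesion, missed entirely
pred = np.zeros_like(gt)
pred[:1000] = True

global_dsc = dice(gt, pred)  # 2*1000 / (1020 + 1000), roughly 0.99
# Lesion-wise: Dice(large lesion) = 1.0; the missed small lesion is a
# false negative scored 0, so the average over TP + FN lesions is 0.5.
lesion_dsc = (1.0 + 0.0) / 2
```

Despite missing an entire lesion, the global DSC barely moves, whereas the lesion-wise score halves, which is exactly the discrepancy lesion-wise evaluation is designed to expose.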

2 Methods

2.1 Challenge Data

BraTS Meningioma Challenge image data were contributed by six United States academic medical centers: Duke University, Yale University, Thomas Jefferson University, University of California San Francisco, University of Missouri, and University of Pennsylvania. Image data consisted of T1, T2, FLAIR, and T1Gd brain MRI sequences from patients with a radiographic or pathologic diagnosis of intracranial meningioma. All data preprocessing was conducted using the FeTS tool (Pati et al., 2022) and included conversion to NIfTI file format, co-registration, 1 mm³ isotropic resampling to the SRI24 atlas space, and automated skull stripping (Schwarz et al., 2019; Thakur et al., 2019, 2020; Juluru et al., 2020; Smith, 2002). The skull stripping algorithm, part of the FeTS preprocessing workflow, was integral in removing non-brain tissue, including the skull and scalp, to isolate the intracranial structures. The brain extraction tool, widely used in neuroimaging pipelines, relies on deformable models and intensity-based thresholding to separate the brain from surrounding tissues (Smith, 2002). In this challenge, skull stripping was critical for preserving patient anonymity by preventing potential face reconstruction from MRI data and for standardizing data preparation across institutions. However, it should be noted that meningiomas often extend through the skull and skull-base foramina, and any extra-cranial portions of the tumors were implicitly excluded by this process (Smith, 2002). Despite this limitation, skull stripping was applied to ensure consistency with other BraTS 2023 challenges and to minimize the inclusion of non-brain tissue.
After image data pre-processing, tumor compartment labels for enhancing tumor, non-enhancing tumor, and surrounding non-enhancing FLAIR hyperintensity (SNFH) were created using a comprehensive pre-segmentation, manual correction, and expert revision process, as seen in Figure 2, which is an unmodified figure by LaBella et al. (LaBella et al., 2023, 2024). The initial pre-segmentation was performed using a deep convolutional neural network-based model (Isensee et al., 2021). Subsequently, 39 annotators manually reviewed and refined the segmentations. These annotators had varying levels of experience, from medical students to fellowship-trained neuroradiologists. Each annotator was given instructions on how to use the ITK-SNAP annotation software (Yushkevich et al., 2006) as well as a document on common errors of meningioma pre-segmentation, as discussed in the dataset resource paper by LaBella et al. (LaBella et al., 2024). After each annotator completed their manual corrections, the labels were sent to a final board-certified, fellowship-trained neuroradiologist (EC) for approval. This multi-step approach ensured the accuracy and consistency of the segmentation labels, incorporating multiple rounds of revision as needed to achieve high-quality final segmentations. All participating institutions received Institutional Review Board and Data Transfer Agreement approvals before contributing data, ensuring compliance with relevant regulatory authorities.
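The isotropic resampling step of the preprocessing pipeline can be illustrated with a short sketch. The actual pipeline used the FeTS tool with registration to the SRI24 atlas; this example shows only resampling to 1 mm³ voxels, implemented here with `scipy.ndimage.zoom` and an assumed (hypothetical) input voxel spacing.

```python
import numpy as np
from scipy import ndimage

def resample_isotropic(volume, spacing_mm, target_mm=1.0, order=1):
    """Resample a 3D volume to isotropic voxels (the challenge target was
    1 mm^3) via spline interpolation; spacing_mm is the input voxel size."""
    factors = [s / target_mm for s in spacing_mm]
    return ndimage.zoom(volume, zoom=factors, order=order)

# Hypothetical acquisition with 0.5 x 0.5 x 2.0 mm voxels:
vol = np.random.rand(64, 64, 16).astype(np.float32)
iso = resample_isotropic(vol, spacing_mm=(0.5, 0.5, 2.0))
print(iso.shape)  # in-plane downsampled, through-plane upsampled to 1 mm
```

Linear interpolation (`order=1`) is a common choice for image intensities; for label volumes, nearest-neighbor interpolation (`order=0`) would be used instead to avoid inventing intermediate label values.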

Figure 1: Axial (left) and coronal (right) views of meningiomas at the most common locations in the skull. This figure is adapted from Murek (Murek, 2024) under the CC-BY-4.0 license.
Figure 2: Meningioma sub-compartments considered in the BraTS Pre-operative Meningioma Dataset. Image panels A-C denote the different tumor sub-compartments included in manual annotations; (A) enhancing tumor (blue) visible on a T1-weighted post-contrast image; (B) the non-enhancing tumor core (red) visible on a T1-weighted post-contrast image; (C) the surrounding FLAIR hyperintensity (green) visible on a FLAIR image; (D) combined segmentations generating the final tumor sub-compartment labels provided in the BraTS Pre-operative Meningioma Dataset.

2.2 Challenge Procedures and Timeline

The BraTS 2023 Intracranial Meningioma Segmentation Challenge was hosted on the Synapse platform using the BraTS Pre-operative Meningioma Dataset (bra, 2023). To access the challenge dataset and be eligible to submit automated segmentation models, participants were required to register as a participating team on the Synapse platform. Registered teams developed automated segmentation algorithms trained on multi-sequence MRI of pre-treatment intracranial meningioma with associated ground truth labels, which were released to the participating teams in May 2023.

In June 2023, participating teams gained access to additional validation data consisting of multi-sequence MRI cases. For the validation data, teams were able to assess the segmentation performance of their models by submitting predicted labels through the Synapse platform, but individual ground truth segmentations were not made publicly available. From July until August 2023, participating teams utilized the validation dataset to fine-tune their segmentation models and compose short-paper manuscripts. At the end of the validation phase, each participating team uploaded their optimal automated segmentation model as an MLCube container, along with their respective manuscript, for evaluation in the testing phase. During the testing phase, the BraTS organizing committee internally evaluated each participating team’s automated segmentation model on the hidden test set of pre-operative meningioma cases with ground truth labels.

2.3 Algorithm Evaluation

During the testing phase, the BraTS organizing committee evaluated metrics on three regions of interest: ET, TC, and WT. ET comprised solely the enhancing tumor compartment label. TC was the combination of the enhancing tumor and non-enhancing tumor compartment labels. WT was the combination of the enhancing tumor, non-enhancing tumor, and SNFH compartment labels. Note that the term “whole tumor” was used across the BraTS 2023 cluster of challenges for consistency; however, this term is not entirely accurate for meningioma, where SNFH typically does not contain any tumor but rather represents associated vasogenic edema. Metrics used for evaluation included the DSC and 95HD, both evaluated at the lesion-wise level. The DSC quantifies the similarity between two samples, which, in this context, refers to the overlap between the automated segmentation and the expert-annotated ground truth labels for each respective tumor compartment. The 95HD measures the 95th percentile of the distances between the boundaries of the predicted and ground truth segmentations. The 95HD was used in lieu of the standard 100% Hausdorff Distance to account for smaller lesions that may suffer from overestimates of the standard 100% Hausdorff Distance. For previous BraTS challenges, a global DSC was used for challenge rankings. However, lesion-wise metrics were adopted for the 2023 challenge as there was greater potential for multiple distinct lesions in a single patient image (most notably for the metastasis and meningioma sub-challenges). Distinct lesions were identified by performing a 1-voxel symmetric dilation on the ground truth WT masks and then evaluating a 26-connectivity 3D connected component analysis to determine whether overlap between distinct lesions exists (Rudie, 2023). A case’s lesion-wise DSC and 95HD scores are calculated based on equations (1) and (2), respectively, where L is the number of ground truth lesions and (true positive (TP) + false negative (FN)) is equal to L (Saluja, 2023).
A predicted lesion is counted as a TP if at least 1 predicted voxel overlaps with the corresponding ground truth region of interest mask. A lesion is counted as a FN if the model does not predict any voxels within the ground truth region of interest mask. A predicted lesion is counted as a false positive (FP) if the model predicts a distinct lesion that does not overlap with any ground truth lesion’s voxels. For each FP or FN, the lesion-wise scoring system assigned that lesion’s region of interest a DSC of 0 and a 95HD of 374. These equations effectively calculate the average DSC or 95HD across all predicted lesions for a given case. The scoring system also excluded ground truth lesions smaller than 50 voxels from evaluation to avoid scoring false ground truth lesions missed in dataset review; this threshold was discussed and decided by fellowship-trained neuroradiologists after ground truth label review (Saluja, 2023; Rudie, 2023). Evaluation of submissions was performed on MLCommons’ MedPerf, an open federated AI/ML evaluation platform (Karargyris et al., 2023). MedPerf automated the evaluation pipeline by running the participants’ models on the testing datasets of each contributing site and calculating evaluation metrics on the resulting predictions. Finally, the Synapse platform retrieved the metrics results from the MedPerf server and ranked them to determine the winner (MLCommons Association, 2024; Pati et al., 2023; Karargyris et al., 2023).

\[
\text{Lesion-wise Dice Score}=\frac{\sum_{i}^{L}\text{Dice}(I_{i})}{\text{TP}+\text{FN}+\text{FP}}\tag{1}
\]
\[
\text{Lesion-wise 95HD}=\frac{\sum_{i}^{L}\text{HD}_{95}(I_{i})}{\text{TP}+\text{FN}+\text{FP}}\tag{2}
\]
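As a concrete illustration, the lesion-wise DSC of equation (1) can be sketched in a few lines of NumPy/SciPy. This is a simplified re-implementation for binary 3D masks, not the official challenge evaluation code; only the DSC term is shown (the 374 95HD penalty is omitted), and the dilation, 26-connectivity, and FP/FN handling follow the description above.

```python
import numpy as np
from scipy import ndimage

def lesion_wise_dice(gt: np.ndarray, pred: np.ndarray) -> float:
    """Lesion-wise DSC per Eq. (1): per-lesion Dice averaged over TP+FN+FP.

    Distinct ground-truth lesions are found by a 1-voxel dilation of the
    binary mask followed by 26-connectivity connected components, as
    described for the challenge; predicted lesions are labeled directly
    (a simplification of the official pipeline).
    """
    struct = np.ones((3, 3, 3), dtype=bool)           # 26-connectivity
    dilated = ndimage.binary_dilation(gt, structure=struct)
    gt_labels, n_gt = ndimage.label(dilated, structure=struct)
    pred_labels, n_pred = ndimage.label(pred, structure=struct)

    dice_sum, tp = 0.0, 0
    matched_pred = set()
    for i in range(1, n_gt + 1):
        lesion = (gt_labels == i) & gt.astype(bool)   # undo the dilation
        overlap_ids = set(np.unique(pred_labels[lesion])) - {0}
        if overlap_ids:                                # TP lesion
            tp += 1
            matched_pred |= overlap_ids
            pred_lesion = np.isin(pred_labels, list(overlap_ids))
            inter = np.logical_and(lesion, pred_lesion).sum()
            dice_sum += 2.0 * inter / (lesion.sum() + pred_lesion.sum())
    fn = n_gt - tp                                     # missed GT lesions score 0
    fp = n_pred - len(matched_pred)                    # spurious predictions score 0
    denom = tp + fn + fp
    return dice_sum / denom if denom else 1.0
```

A case with one perfectly segmented lesion and one false-positive lesion, for example, would score 1.0 / (1 TP + 1 FP) = 0.5, which is the behavior discussed later in the Results.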

2.4 Participant Ranking and Workshop Proceedings

The BraTS organizing committee internally evaluated each participating team’s automated segmentation model on the hidden test set of pre-operative meningioma cases to determine lesion-wise DSC and 95HD for each of the three regions of interest. The participants were ranked against each other for each region of interest’s lesion-wise metric independently. A total of 6 independent rankings were calculated, reflecting the two metrics, DSC and 95HD, for each of the ET, TC, and WT regions of interest. A “BraTS segmentation score” was then calculated as the average of these independent lesion-wise rankings. For example, if a team had the 3rd best ET DSC, 2nd best TC DSC, 3rd best WT DSC, 3rd best ET 95HD, 2nd best TC 95HD, and 4th best WT 95HD, then that team would have an average ranking of (3 + 2 + 3 + 3 + 2 + 4) / 6 = 2.83 as their BraTS segmentation score. The BraTS segmentation score was used to determine the final participant rankings relative to one another. The three top-ranked teams were invited to present their findings at the BraTS workshop at the 2023 MICCAI Annual Meeting held in Vancouver, Canada, although the final ranking was hidden until the workshop. At the BraTS workshop, the BraTS organizing committee announced the final placement of the three top-ranked teams. Monetary awards of $1400, $1000, and $800 were presented to the three top-ranked teams, respectively.
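The rank-averaging step is simple arithmetic; a minimal sketch follows, in which the function name and dictionary keys are illustrative rather than taken from the challenge codebase.

```python
def brats_segmentation_score(ranks: dict) -> float:
    """Average the six per-metric ranks (lower is better): DSC and 95HD
    for each of the ET, TC, and WT regions of interest."""
    expected = {"ET_DSC", "TC_DSC", "WT_DSC", "ET_95HD", "TC_95HD", "WT_95HD"}
    if set(ranks) != expected:
        raise ValueError("expected exactly one rank per metric/region pair")
    return sum(ranks.values()) / len(ranks)

# The worked example from the text: ranks of 3, 2, 3, 3, 2, 4 average to 2.83.
example = {"ET_DSC": 3, "TC_DSC": 2, "WT_DSC": 3,
           "ET_95HD": 3, "TC_95HD": 2, "WT_95HD": 4}
score = brats_segmentation_score(example)
```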

2.5 Challenge Results Analysis

Overall participant and individual team statistical analysis of lesion-wise DSC and 95HD performance was performed using Python and Microsoft Excel. The analysis included the participant average, standard deviation, and median DSC and 95HD for each region of interest; the overall average and median DSC and 95HD across all participants for each region of interest; volume calculations of each lesion; and the number of lesion voxels abutting the edge of the pre-processed brain MRI.

2.6 Analysis of Tumor Abutment of Brain Masks

Given the extra-axial location of meningiomas, we sought to evaluate the proportion of meningiomas that were potentially cropped or excluded by the automated skull-stripping process. To determine the volume of each compartment label and the number of tumor compartment voxels directly abutting the edge of the skull-stripped image for each case, the BraTS organizers internally ran the NumberOfEdgeNeighbors.py script on each of the 1483 meningioma MRI cases (LaBella, 2024). This analysis was performed internally due to restricted access to the hidden test dataset. To determine the significance of the association between WT volume and the number of abutting voxels, the Pearson correlation coefficient was calculated with an associated p-value at a significance level of 0.05.
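The edge-abutment count can be reproduced conceptually as follows. This is a hypothetical sketch of the idea behind NumberOfEdgeNeighbors.py, assuming a 26-neighbour definition of "abutting"; it is not the organizers' script.

```python
import numpy as np
from scipy import ndimage

def count_abutting_voxels(tumor_mask: np.ndarray, brain_mask: np.ndarray) -> int:
    """Count tumor voxels that directly touch the skull-stripped boundary.

    A voxel is considered to abut the edge when at least one of its 26
    neighbours lies outside the brain mask (i.e. in the background of the
    skull-stripped image).
    """
    struct = np.ones((3, 3, 3), dtype=bool)
    background = ~brain_mask.astype(bool)
    # Dilate the background by one voxel; tumor voxels inside the dilated
    # background touch at least one non-brain neighbour.
    edge_zone = ndimage.binary_dilation(background, structure=struct)
    return int(np.logical_and(tumor_mask.astype(bool), edge_zone).sum())
```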

3 Results

A total of 1000 training (70%), 141 validation (10%), and 283 test (20%) multi-sequence pre-operative meningioma MRI cases were utilized within the BraTS Meningioma Challenge (Table 1), following a standard train/validation/test split.

A total of 9 participating teams submitted automated segmentation models via MLCube for the BraTS Challenge 2023: Intracranial Meningioma. The statistical summary of the teams’ performances is outlined in Tables 2 and 3, which list the calculated DSC and 95HD values, respectively. The maximum recorded average DSCs for ET, TC, and WT are 0.899, 0.904, and 0.871, respectively, and the minimum recorded average 95HDs for ET, TC, and WT are 23.9, 21.8, and 31.4, respectively, highlighting the upper bounds of team performance within the challenge. The overall challenge summary statistics across all participating teams are listed in Table 4 for both DSC and 95HD. Figure 3 shows violin plots of DSC and 95HD scores for the ET, TC, and WT regions across all participating teams. Figure 4 compares the predictions of the top 3 participants’ algorithms for a single testing set case.

Table 1: This table presents the total number of cases in each of the training, validation, and testing sets. Note that the training data was released with ground truth labels. Note that the validation data was released without ground truth labels. The included institutions are Duke University (Duke), Thomas Jefferson University (TJU), Missouri (Miss), University of Pennsylvania (Penn), University of California San Francisco (UCSF), and Yale University (Yale).
                Train       Validation   Test
Total Count     1000        141          283
Duke            315         46           91
TJU             236         34           68
Miss            132         16           33
Penn            31          4            9
UCSF            126         18           35
Yale            160         23           47
Release Date    May 2023    June 2023    Never released
Table 2: Team DSC scores, average ± SD (median), for the ET, TC, and WT regions of interest, with combined team rankings for each respective metric.
Team Name       ET DSC                  TC DSC                  WT DSC                  Rank (ET, TC, WT)
NVAUTO          0.899 ± 0.189 (0.976)   0.904 ± 0.180 (0.976)   0.871 ± 0.198 (0.964)   1, 1, 1
CNMC_PMI2023    0.876 ± 0.217 (0.968)   0.867 ± 0.227 (0.968)   0.851 ± 0.231 (0.953)   2, 3, 2
blackbean       0.870 ± 0.222 (0.969)   0.879 ± 0.206 (0.969)   0.845 ± 0.226 (0.957)   3, 2, 3
Sherlock        0.854 ± 0.234 (0.958)   0.850 ± 0.239 (0.959)   0.831 ± 0.244 (0.945)   4, 4, 4
huilin          0.830 ± 0.276 (0.959)   0.820 ± 0.258 (0.958)   0.761 ± 0.297 (0.897)   5, 5, 6
i_sahajmistry   0.799 ± 0.291 (0.954)   0.773 ± 0.303 (0.949)   0.764 ± 0.296 (0.932)   6, 7, 5
Kurtlab-UW      0.790 ± 0.237 (0.896)   0.774 ± 0.250 (0.892)   0.745 ± 0.257 (0.872)   7, 6, 8
MIA             0.775 ± 0.305 (0.940)   0.757 ± 0.307 (0.941)   0.751 ± 0.306 (0.916)   8, 8, 7
UMNiverse       0.007 ± 0.084 (0.000)   0.027 ± 0.078 (0.006)   0.241 ± 0.290 (0.092)   9, 9, 9
Table 3: Team 95% Hausdorff distances, average ± SD (median), for the ET, TC, and WT regions of interest, with combined team rankings for each respective metric.
Team Name       ET 95HD                 TC 95HD                 WT 95HD                 Rank (ET, TC, WT)
NVAUTO          23.9 ± 68.5 (0.96)      21.8 ± 64.6 (1.0)       31.4 ± 71.8 (1.0)       1, 1, 1
CNMC_PMI2023    30.0 ± 80.9 (1.0)       31.7 ± 83.5 (1.0)       35.2 ± 86.8 (1.62)      2, 3, 2
blackbean       34.3 ± 82.0 (1.0)       29.9 ± 75.0 (1.0)       41.2 ± 84.3 (1.0)       3, 2, 4
Sherlock        34.3 ± 87.5 (1.07)      35.1 ± 88.3 (1.93)      39.7 ± 93.1 (1.94)      4, 4, 3
Kurtlab-UW      39.9 ± 90.2 (2.0)       45.9 ± 95.8 (2.0)       56.0 ± 100.4 (1.05)     5, 5, 6
huilin          46.9 ± 104.2 (1.0)      47.7 ± 105.2 (1.41)     55.9 ± 106.9 (3.61)     6, 6, 5
i_sahajmistry   56.5 ± 108.9 (1.0)      64.1 ± 112.5 (1.41)     66.2 ± 111.0 (2.0)      7, 8, 8
MIA             61.4 ± 117.6 (1.41)     61.6 ± 118.0 (1.73)     64.2 ± 119.5 (2.83)     8, 7, 7
UMNiverse       371.4 ± 314.9 (374.0)   150.1 ± 151.5 (81.5)    158.8 ± 146.9 (189.1)   9, 9, 9
Figure 3: Violin plots of DSC and 95HD scores for the ET, TC, and WT regions across all of the participating teams. The subplots are organized as: A1 (ET DSC), A2 (TC DSC), A3 (WT DSC), B1 (ET 95HD), B2 (95HD TC), B3 (95HD WT).
Figure 4: Image panels demonstrating the different predictions of the top 3 teams for a pre-operative meningioma testing set case as seen on T1Gd (top row) and FLAIR (bottom row) MRI.


Table 4: Summary statistics for DSC and 95% Hausdorff Distance (95HD) across ET, TC, and WT regions of interest for 9 participating teams in a segmentation task. The DSC (Dice Similarity Coefficient) and 95HD metrics are presented with their respective statistics: Average, Std (Standard Deviation), Median, (Q1, Q3) (1st and 3rd Quartiles), and either Max or Min values as applicable.
Statistic   ET DSC           TC DSC           WT DSC           ET 95HD        TC 95HD        WT 95HD
Average     0.830            0.820            0.764            39.9           46.0           55.9
Std         0.234            0.239            0.257            87.5           95.76          100.4
Median      0.958            0.958            0.933            1.00           1.414          2.00
(Q1, Q3)    [0.872, 0.981]   [0.850, 0.979]   [0.630, 0.972]   [1.00, 4.24]   [1.00, 6.27]   [1.00, 14.5]
Max Avg     0.899            0.904            0.871            23.9           21.8           31.4
Max Med     0.976            0.976            0.964            1.00           1.00           1.00

The top 3 ranked teams for the BraTS Meningioma Challenge were NVAUTO, blackbean, and CNMC_PMI2023 (Capellan-Martin, 2023; Myronenko et al., 2023; Huang et al., 2023b). Each of these teams was invited to give an oral presentation of their findings and methods at MICCAI 2023 in Vancouver, Canada. Their average lesion-wise DSC and 95HD and median lesion-wise DSC, averaged over ET, TC, and WT, are listed in Table 5. NVAUTO’s winning MONAI Auto3DSeg framework and blackbean’s STU-Net framework are open-source and freely accessible (Apache License 2.0) (Myronenko et al., 2023; Myronenko, 2018; Consortium, 2020; Huang et al., 2023b, a). CNMC_PMI2023 utilized an ensemble of nnU-Net, an open-source and freely accessible Apache License 2.0 model, and a SWIN-transformer-based model, freely accessible under the MIT License (Capellan-Martin, 2023). The team leaders, submitted short paper titles, and technical aspects of their algorithms are listed below:

  1. NVAUTO: Andriy Myronenko et al., Auto3DSeg for Brain Tumor Segmentation from 3D MRI in BraTS 2023 Challenge (Myronenko et al., 2023).

    NVAUTO employed the Auto3DSeg tool from MONAI for brain tumor segmentation using 3D MRI scans (Myronenko, 2018; Consortium, 2020; Myronenko et al., 2023). The core of the model architecture used in the challenge was SegResNet, a U-Net-based convolutional neural network designed for semantic segmentation tasks. This model utilizes an encoder-decoder structure, incorporating repeated ResNet blocks with batch normalization and deep supervision, which helps guide training through multiple layers of the network (Myronenko, 2018). To improve performance and robustness, several data augmentation techniques were applied, including random affine transformations, flipping, intensity scaling, shifting, noise addition, and blurring. These augmentations help the model generalize better by simulating variations in MRI data. The loss function used for training combined Dice loss and focal loss, with the goal of addressing class imbalance by emphasizing harder-to-segment areas and penalizing inaccurate segmentations of minority classes. Additionally, the loss was summed across deep-supervision sublevels, meaning the network computed losses at various resolution scales to refine the segmentation. The optimizer employed was AdamW, with an initial learning rate of 2e-4, gradually reduced to zero using a cosine annealing scheduler. This adaptive optimization method, combined with learning rate decay, ensures better convergence and prevents overfitting. Weight decay regularization of 1e-5 was also used to prevent overfitting by penalizing large weights in the model. The Auto3DSeg framework was designed to be user-friendly, requiring minimal manual input. It automates several stages of the training and optimization process, making it accessible even to non-experts. Advanced users can fine-tune various parameters, such as hyperparameters and model architecture, for improved performance. The training setup leveraged 8 NVIDIA V100 GPUs with 16 GB of memory each, and a 5-fold cross-validation process was used to ensure generalizability across different MRI datasets, further improving the model’s accuracy and robustness.
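The combined Dice-plus-focal objective described above can be illustrated for a single binary channel. NVAUTO used MONAI's multi-channel, deep-supervised implementation; the NumPy function below is only a conceptual sketch with an assumed focal exponent of gamma = 2.

```python
import numpy as np

def dice_focal_loss(prob: np.ndarray, target: np.ndarray,
                    gamma: float = 2.0, eps: float = 1e-6) -> float:
    """Illustrative combined soft-Dice + focal loss for one binary channel.

    prob holds per-voxel foreground probabilities; target holds 0/1 labels.
    """
    prob = prob.ravel().astype(np.float64)
    target = target.ravel().astype(np.float64)
    # Soft Dice loss: 1 - 2|P∩G| / (|P| + |G|); small eps avoids division by zero.
    dice = 1.0 - (2.0 * (prob * target).sum() + eps) / (prob.sum() + target.sum() + eps)
    # Focal loss: the (1 - p_t)^gamma factor down-weights easy voxels so
    # hard-to-segment (minority-class) voxels dominate the gradient.
    p_t = np.where(target == 1, prob, 1.0 - prob)
    focal = (-((1.0 - p_t) ** gamma) * np.log(np.clip(p_t, eps, 1.0))).mean()
    return float(dice + focal)
```

A perfect prediction drives both terms to zero, while an uncertain prediction (e.g. 0.5 everywhere) is penalized by both terms.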

  2. blackbean: Ziyan Huang et al., Evaluating STU-Net for Brain Tumor Segmentation (Huang et al., 2023b).

    Blackbean utilized the Scalable and Transferable U-Net (STU-Net) model in the 2023 BraTS Challenge (Huang et al., 2023b, a). STU-Net builds upon the nnU-Net architecture but introduces key modifications to enhance its scalability and transferability for large-scale medical image segmentation tasks (Huang et al., 2023a; Isensee et al., 2021). The model’s architecture ranges from 14 million to 1.4 billion parameters, enabling flexibility depending on computational resources. The core innovation lies in the incorporation of residual connections to address gradient diffusion, and of downsampling blocks within each encoder stage for more efficient feature extraction. STU-Net also utilizes nearest-neighbor interpolation with a 1×1×1 convolution layer for upsampling, which improves the model’s ability to generalize and transfer learning across different imaging tasks (Huang et al., 2023a). A compound scaling strategy ensures balanced growth of both encoder and decoder components, optimizing both depth and width. Pre-trained on the TotalSegmentator dataset, which covers 104 foreground classes, the model demonstrates robust transferability to the BraTS brain tumor segmentation task (Wasserthal et al., 2022). Blackbean adhered to the default data pre-processing, data augmentation, and training procedures provided by nnU-Net, and utilized the SGD optimizer with a Nesterov momentum of 0.99 and a weight decay of 1e-3 (Isensee et al., 2021). The batch size was fixed at 2, and each epoch consisted of 250 iterations. The learning rate decay followed the poly learning rate policy: (1 − epoch/1000)^0.9. Data augmentation techniques used during training included additive brightness, gamma, rotation, scaling, mirror, and elastic deformation. The pre-training patch size on the TotalSegmentator dataset was 128 × 128 × 128 (Wasserthal et al., 2022). Fine-tuning patch sizes on downstream tasks were automatically configured by nnU-Net (Isensee et al., 2021).
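The poly learning-rate policy quoted above is a one-line formula. The sketch below assumes nnU-Net's default base learning rate of 0.01, which is not stated in this section.

```python
def poly_lr(epoch: int, max_epochs: int = 1000, base_lr: float = 0.01,
            exponent: float = 0.9) -> float:
    """nnU-Net-style poly decay: lr = base_lr * (1 - epoch/max_epochs)**0.9.

    The rate starts at base_lr at epoch 0 and decays to exactly 0 at
    max_epochs, falling faster toward the end of training.
    """
    return base_lr * (1.0 - epoch / max_epochs) ** exponent

# poly_lr(0) -> 0.01; poly_lr(1000) -> 0.0
```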

  3. CNMC_PMI2023: Daniel Capellán-Martín et al., Model Ensemble for Brain Tumor Segmentation in Magnetic Resonance Imaging (Capellan-Martin, 2023).

    CNMC_PMI2023 used an ensemble-based approach. The ensemble strategy combines two state-of-the-art deep learning models: nnU-Net and Swin UNETR (Isensee et al., 2021; Tang et al., 2022). The 3D nnU-Net model was trained using five-fold cross-validation, with input images divided into patches of 128 × 160 × 112 voxels (Isensee et al., 2021). The output consisted of three channels corresponding to the three tumor sub-regions. A combined Dice loss and cross-entropy loss function was employed, optimized using the stochastic gradient descent (SGD) algorithm with Nesterov momentum (learning rate: 0.01, momentum: 0.99, weight decay: 3e-5). Inference was conducted using a sliding window approach. The vision-transformer-based 3D Swin UNETR model was trained using five-fold cross-validation, with input patches of 96 × 96 × 96 voxels (Tang et al., 2022). The output was four channels: three tumor sub-regions and background. The combined Dice loss and focal loss function was optimized using the AdamW optimizer (learning rate: 0.0001, momentum: 0.99, weight decay: 3e-5). To improve segmentation accuracy and robustness, predictions from nnU-Net and Swin UNETR were ensembled. The ensembling process involved combining outputs for the tumor regions (WT, TC, and ET) from each model across the five cross-validation folds. Given the task’s emphasis on lesion-wise evaluation, a post-processing step was developed to clean small disconnected regions of <50 voxels. Training for nnU-Net models was conducted on an NVIDIA A100 GPU with 40 GB of memory, while Swin UNETR models were trained on NVIDIA A5000 (24 GB) and A6000 (48 GB) GPUs. Hyperparameter optimization was carried out using the Optuna framework (Akiba et al., 2019).
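The ensembling and <50-voxel clean-up can be sketched as follows. Probability averaging with a 0.5 threshold and 26-connectivity are assumptions here, since the team's exact fusion rule is not detailed in this section; this is not their code.

```python
import numpy as np
from scipy import ndimage

def ensemble_and_clean(probs_a: np.ndarray, probs_b: np.ndarray,
                       thresh: float = 0.5, min_voxels: int = 50) -> np.ndarray:
    """Average two models' probability maps, threshold to a binary mask,
    then drop connected components smaller than min_voxels (the
    <50-voxel post-processing step for lesion-wise scoring)."""
    mask = ((probs_a + probs_b) / 2.0) >= thresh
    labels, n = ndimage.label(mask, structure=np.ones((3, 3, 3), dtype=bool))
    for i in range(1, n + 1):
        comp = labels == i
        if comp.sum() < min_voxels:
            mask[comp] = False          # remove the small spurious region
    return mask.astype(np.uint8)
```

Removing tiny disconnected blobs is directly motivated by the lesion-wise metrics: each spurious component would otherwise count as a false-positive lesion and pull the case score down.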

Table 5: This table shows the lesion-wise metrics for each of the top 3 ranked teams in the BraTS Challenge 2023: Intracranial Meningioma challenge. Note that a precision of 4 decimals is used due to the close final results amongst top participating teams.
Team Name       Average DSC   Median DSC   Average 95HD
NVAUTO          0.8909        0.9855       25.70
blackbean       0.8643        0.9861       34.84
CNMC_PMI2023    0.8638        0.9855       32.30

Of note, NVAUTO placed in the top 2 for each of the five distinct BraTS 2023 automated segmentation challenges: first place for Meningioma, BraTS-Africa, and Brain Metastases, and second place for Adult Glioma and Pediatric Tumors. For the meningioma challenge, NVAUTO achieved a tumor core DSC ≥ 0.90 on 228 of 283 testing phase cases. Additionally, during the public validation phase, as described in their in-person oral presentation at MICCAI, NVAUTO reported achieving an average composite DSC of 0.935, substantially higher than their testing phase top score of 0.891, which suggests some degree of overfitting (Myronenko et al., 2023).

3.1 Notable Challenge Cases and Statistics

The top median DSC across all participants for a specific case was a perfect score of 1.00 for enhancing tumor and tumor core, as shown in Figure 5. Note that there were a total of 33 enhancing tumor voxels in this case, 23 of which abutted the edge of the MR image. The cases with the next highest overall median and average DSC are shown in Figure 6. Note that the ET volumes are qualitatively much greater in these particularly high-DSC cases.

Two meningioma cases with somewhat unusual imaging appearance, shown in Figures 7 and 8, yielded poor test performance metrics across all participants. Notably, non-enhancing tumor made up the majority of the TC and WT in both cases. These lesions correspond to heavily calcified lesions with little or no visible enhancement and low signal intensity on all provided sequences.

Figure 5: Image panels of the top scored individual test case with a median participant DSC of 1.00, 1.00, and 1.00 and average participant DSC of 0.993, 0.881, and 0.882 for enhancing tumor, tumor core, and whole tumor; respectively. Tumor ground truth sub-compartment labels annotated on axial (A), sagittal (B), and coronal (C) views of a T1Gd MRI head case. Panel D depicts the tumor abutting the edge of the skull-stripped brain without a label.
Figure 6: Image panels that depict participants median enhancing tumor and tumor core DSC of (A) 0.993, (B) 0.991, (C) 0.991, and (D) 0.991 (0.990 for TC). Note that the averages were 0.876, 0.878, 0.864, 0.875 respectively, which is due to a single team having a DSC of approximately 0.002 for each case, whereas every other team had an ET DSC above 0.95 for each case.
Figure 7: Image panels depicting a completely calcified extra-axial non-enhancing meningioma that had a median participant DSC of 1.00, 0.00, and 0.00 and average participant DSC of 0.888, 0.175, and 0.174 for enhancing tumor, tumor core, and whole tumor; respectively.
Figure 8: MRI study demonstrating the worst performing meningioma case for NVAUTO with an ET DSC of 0.00, TC DSC of 0.00, and WT DSC of 0.338. The blue and red labels represent ground truth enhancing tumor and non-enhancing tumor, respectively, which combined make up the tumor core region of interest.
Figure 9: Scatterplot graph depicting the overall participant median DSC for each region of interest to the WT volume for each of the test cases.

Each region of interest’s DSC was compared to the WT volume of each respective case, as shown in Figure 9. There was a nonsignificant positive linear correlation between DSC and WT volume for each of the three regions of interest, ET, TC, and WT, with p values of 0.696, 0.689, and 0.741, respectively. There was a nonsignificant logarithmic correlation between DSC and WT volume for each of the three regions of interest, ET, TC, and WT, with p values of 0.102, 0.093, and 0.200, respectively (not shown in figure). Notably, Figure 9 also demonstrates a substantial number of cases with a lesion-wise DSC of approximately 0.5 for each of the regions of interest. This occurs because the lesion-wise metrics assign a value of 0 to each false-positive prediction and each missed (false negative) ground truth lesion; a case with one very well-segmented lesion and one FP or FN therefore averages to approximately 0.5.

Skull-stripping resulted in 908 of 1000 training cases, 129 of 141 validation cases, and 257 of 283 test cases’ meningioma labels (1286 of 1424, 90.3% overall) having at least 1 compartment voxel abutting the edge of the skull-stripped image, as seen in the example case in Figure 10. Of the 257 aforementioned test cases, the average and median number of abutting voxels were 628.7 and 394 voxels, respectively. Figure 11 shows the relationship between the number of abutting voxels and the WT volume (R² = 0.190, p = 0.002).
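The reported correlation (R² = 0.190, p = 0.002) corresponds to a standard Pearson analysis, which can be reproduced on any (volume, abutting-count) pairs with SciPy; the wrapper below is illustrative, not the authors' analysis code.

```python
import numpy as np
from scipy import stats

def abutment_volume_correlation(wt_volumes, abutting_counts, alpha=0.05):
    """Pearson correlation between WT volume and edge-abutting voxel count,
    mirroring the paper's significance test (alpha = 0.05)."""
    r, p = stats.pearsonr(np.asarray(wt_volumes, dtype=float),
                          np.asarray(abutting_counts, dtype=float))
    return {"r": float(r), "r_squared": float(r * r),
            "p_value": float(p), "significant": bool(p < alpha)}
```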

Figure 10: Image panels with the tumor sub-regions annotated on sagittal views of T1Gd MRI head cases. Panels A1 and A2 represent a meningioma tumor with 4514 voxels abutting the skull-stripped boundary. Panels B1 and B2 represent a meningioma tumor with 3911 voxels abutting the skull-stripped boundary.
Figure 11: Scatterplot depicting the relationship between WT size and the number of abutting voxels.

While global metrics (i.e., those used in previous BraTS challenges) were not used for ranking in the 2023 challenge, we have included Figure 12 to show the relationship between the complete tumor volume and each of the global metrics in cases with only a single lesion for each region of interest. Note that there is a noticeable trend towards improved global DSC, global 95HD, and global sensitivity as the complete tumor volume increases.

Figure 12: Plots of sliding window average global DSC, global 95HD, and sensitivity for each of the ET, TC, and WT regions of interest for the subset of BraTS testing cases that only had 1 ground truth lesion. The number of subjects with a single lesion is ET (N=255), TC (N=255), and WT (N=252), for each respective label.

4 Discussion

4.1 Challenge Summary

The BraTS 2023 Intracranial Meningioma Segmentation Challenge provided unprecedented insight into state-of-the-art performance in pre-operative meningioma segmentation, leveraging the largest multi-institutional, systematically expert-annotated, multilabel meningioma MRI dataset to date (Calabrese and LaBella, 2023; LaBella et al., 2024). The challenge saw remarkable performances, particularly from the NVAUTO team, in DSC and 95HD across the ET, TC, and WT region of interest segmentation tasks. Notably, DSC scores were, on average, higher for the meningioma challenge than for all other BraTS 2023 segmentation sub-challenges, which may be due to the relatively lower complexity of meningiomas compared to other tumor types and/or the relatively high quality and consistency of the meningioma dataset. These lesion-wise DSC and 95HD scores should be considered benchmarks for future pre-operative meningioma segmentation evaluation. Note that lesion-wise metrics are essential for segmentation tasks with potential for more than one distinct lesion, as global metrics may remain relatively high even if a single, smaller lesion is completely missed by the segmentation algorithm.

4.2 The Best Algorithm and Caveats

The superior performance of NVAUTO, with lesion-wise median DSC values of 0.976, 0.976, and 0.964 and lesion-wise average DSC values of 0.899, 0.904, and 0.871 for ET, TC, and WT, respectively, signifies a notable advancement in automated meningioma segmentation algorithms. These results not only demonstrate the feasibility of achieving high accuracy in meningioma segmentation, but also suggest that deep learning models can effectively adapt to the diverse morphology and anatomical locations of meningiomas. NVAUTO’s base algorithm, Auto3DSeg, an open-source, PyTorch-based framework, is particularly adaptable to a variety of medical imaging automated segmentation challenges (Myronenko et al., 2023). Auto3DSeg allows for auto-scaling to available GPUs; 5-fold training with SegResNet, DiNTS, and SwinUNETR models; and inference and ensembling across the multiple trained models. Across the challenges, NVAUTO found the SegResNet model to be the most accurate, and it was ultimately the model architecture selected by Auto3DSeg for each of the automated segmentation challenges (Myronenko et al., 2023). Additionally, Auto3DSeg supports a variety of imaging modalities and is not limited to MRI. Auto3DSeg is advertised to run with GPU RAM ≥ 8 GB; however, for the 2023 BraTS challenges, NVAUTO used 8 × 16 GB NVIDIA V100 GPUs (Myronenko et al., 2023). Future studies should assess the ability to run Auto3DSeg on more widely available, consumer-grade GPUs and on non-NVIDIA GPUs.

4.3 Overall Segmentation Performance

The challenge results revealed a broad range of performances among the 9 participating teams, with average DSC scores for ET ranging from 0.007 to 0.899. Such variability underscores the challenge’s complexity and the diversity of segmentation approaches. When comparing NVAUTO’s meningioma segmentation performance with their performance on the other challenges, they performed best on meningioma for the ET and TC DSC but only second best for the WT DSC, as shown in Table 6. NVAUTO’s algorithm achieved a higher WT DSC on Sub-Saharan Africa gliomas than on all other tumor types. Since meningiomas tend to have a smaller overall whole tumor volume and a higher ET or TC to WT ratio compared to glioblastoma, it can be hypothesized that there is less training information available to create accurate SNFH compartment labels, and thereby the WT region of interest (Ogasawara et al., 2021; Gilard et al., 2021; Khandwala et al., 2021).

Table 6: Average DSC for ET, TC, and WT for the top performing team, NVAUTO, using their algorithm, Auto3DSeg, across each of the different BraTS 2023 segmentation challenges. Note that 3 decimal precision is shown for meningioma to emphasize the small increase in DSC for non-ET compared to ET.
                      ET (Avg DSC)   TC (Avg DSC)   WT (Avg DSC)
Meningioma            0.899          0.904          0.870
Glioma                0.810          0.830          0.840
Sub-Saharan Africa    0.790          0.840          0.910
Metastasis            0.600          0.650          0.620
Pediatric             0.550          0.780          0.840

4.4 Limitations of the BraTS Meningioma Benchmark

Our analysis revealed that all participating teams performed relatively poorly on heavily calcified meningiomas with little or no enhancing tumor. For example, teams had an average TC DSC of only 0.156 for one particular non-enhancing meningioma case. This relatively poor performance is presumably related to the relative rarity of this imaging appearance of meningioma, and the fact that only a small number of such cases were included in the training dataset. Future datasets should include more cases of exclusively non-enhancing tumor to allow for more generalizable automated segmentation models. It is important to consider that radiotherapy plans only consider a single GTV represented by the TC, and whether these indeterminate TC regions are labeled as non-enhancing vs enhancing regions would not impact the resulting treatment volumes (Rogers et al., 2017, 2020).

Note that in Figure 9, the lines of best fit for ET, TC, and WT trend towards improved DSC with increased WT volume. In an automated segmentation challenge, this could cause a higher perceived test set overall DSC score if a larger proportion of larger tumors are included in the test set compared to the overall population. Therefore, it is important to ensure balance within the training, validation, and test sets regarding tumor size, which was not explicitly done for this iteration of the challenge.

Another notable limitation of our study is the absence of explicit testing on out-of-distribution (OOD) cases. While our model demonstrated strong performance on the provided BraTS 2023 meningioma dataset, all data were derived from a limited number of institutions, and no evaluation was performed on data from external sources or significantly different MRI acquisition protocols. Consequently, the generalization ability of the model to cases from different populations, MRI machines, or acquisition settings remains untested. Future work should focus on assessing the model’s performance on OOD data to ensure its robustness and applicability in real-world clinical environments. Techniques such as domain adaptation and cross-institutional validation will be crucial to improve the model’s generalization capabilities and reliability across diverse clinical settings.

4.5 Future Directions

Future studies involving automated meningioma segmentation should focus on the most clinically important volumes, particularly the tumor core, which comprises the radiotherapy gross tumor volume (GTV). Additionally, future studies should focus on the segmentation of meningioma along the intra-axial and extra-axial border using various face anonymization pre-processing techniques, given the high frequency with which meningiomas are partially excluded by skull-stripping, as demonstrated by our results (Watts et al., 2014; Schwarz et al., 2019, 2021).

This study analyzed the propensity of meningiomas to abut the boundary of the skull-stripped image, with the potential for portions of the meningioma to be excluded from the skull-stripped image. Because skull-stripping resulted in 1286 of 1483 meningiomas having at least 1 compartment voxel abutting the boundary of the skull-stripped image, future studies should evaluate a different pre-processing anonymization technique that allows inclusion of more of the intracranial meningioma volume. Schwarz et al. describe a promising mri_reface technique that performs face anonymization by modifying the face and ear regions of the MR head image to represent an average human face and ears, thereby preserving the vast majority of the MR head image (Schwarz et al., 2019, 2021). Bischoff-Grethe et al. describe mri_deface, a defacing tool that removes facial features by assigning each voxel a probability of being “face” or “brain” and removing voxels that have a non-zero probability of being “face” but zero probability of being “brain” (Theyers et al., 2021; Bischoff-Grethe et al., 2007).

Furthermore, while current segmentation approaches predominantly utilize MRI, atypical meningiomas, such as those that are completely non-enhancing or heavily calcified, remain challenging to segment automatically due to low signal intensity and limited contrast. In these cases, integrating additional imaging modalities could prove highly beneficial. PET imaging, especially with tracers such as 68Ga-DOTATATE, provides metabolic information that can help differentiate viable tumor tissue from calcified or fibrotic areas (Prasad et al., 2022). Likewise, CT offers superior spatial resolution for delineating calcifications and osseous involvement, which is critical when meningiomas extend into or invade bone (Salah et al., 2019). Indeed, studies have demonstrated that PET and CT can outperform MRI in identifying bony invasion and calcification in meningiomas (Galldiks et al., 2023; Salah et al., 2019). Thus, incorporating multi-modal imaging may lead to more robust segmentation algorithms capable of addressing the full spectrum of meningioma presentations.

RTOG 0539, a phase II trial of observation for low-risk meningiomas and of radiotherapy for intermediate- and high-risk meningiomas, describes the radiation treatment planning and target volume protocols that should be used for meningiomas (Rogers et al., 2017, 2020). For radiation planning, the trial required only pre-operative and post-operative contrast-enhanced MRI. RTOG 0539 defines the GTV to encompass the tumor bed on post-operative contrast-enhanced MRI and to include any residual nodular enhancement. It also states that the trailing linear dural tail and cerebral edema should not be specifically included within the GTV, since there is no evidence that recurrence is more likely within the dural tail (Rogers et al., 2017, 2020).

Therefore, future studies that focus on automated segmentation of meningioma for radiation therapy planning should place emphasis on evaluation of the enhancing tumor and post-op resection bed volumes on post-operative T1Gd treatment planning MRI, while reducing emphasis on the surrounding non-enhancing FLAIR hyperintensity compartment and the small trailing linear dural tail enhancement.

However, in the post-treatment follow-up setting, RTOG 0539 still required T1, T2, FLAIR, and T1Gd series, similar to the imaging used in the BraTS 2023 Intracranial Meningioma Challenge. For this challenge’s pre-operative meningioma dataset, the enhancing and non-enhancing tumor labels compose the tumor core. The SNFH label, representing edema, was included in the WT region of interest but is not typically included within radiation therapy meningioma target volumes.
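The composition of the compound evaluation regions from the mutually exclusive labels can be made concrete with a short sketch. The numeric encoding assumed here (1 = non-enhancing tumor core, 2 = SNFH, 3 = enhancing tumor) follows the standard BraTS 2023 convention but should be verified against the dataset documentation; the function name is illustrative:

```python
import numpy as np

# Assumed BraTS 2023 label encoding (verify against dataset documentation):
NETC, SNFH, ET = 1, 2, 3

def compound_regions(label_map: np.ndarray):
    """Derive the compound evaluation regions from a BraTS label map."""
    et = label_map == ET
    tc = et | (label_map == NETC)   # tumor core = enhancing + non-enhancing tumor
    wt = tc | (label_map == SNFH)   # whole tumor additionally includes SNFH (edema)
    return et, tc, wt

# Toy 2x3 label map with one voxel each of NETC and ET, and two of SNFH.
toy = np.array([[0, 1, 3],
                [2, 2, 0]])
et, tc, wt = compound_regions(toy)
print(et.sum(), tc.sum(), wt.sum())  # 1 2 4
```

For the radiotherapy-oriented evaluation discussed above, emphasis would fall on the `tc` mask, since SNFH is excluded from typical meningioma target volumes.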

5 Conclusion

The BraTS 2023 Intracranial Meningioma Segmentation Challenge has marked a significant step forward in the segmentation of meningioma tumors, highlighting both the potential and limitations of current methodologies. As the field moves forward, a focus on enhancing dataset diversity, refining pre-processing techniques, and tailoring segmentation tasks to specific clinical needs will be crucial in translating these advancements into clinical practice.


Acknowledgments

Research reported in this publication was partly supported by the National Institutes of Health (NIH) under award numbers: NCI K08CA256045, NCI/ITCR U24CA279629, and NCI/ITCR U01CA242871. The content of this publication is solely the responsibility of the authors and does not represent the official views of the NIH.

Developing large and well curated mpMRI datasets for automated segmentation model development requires significant time and expertise from neuro-radiology experts. We are grateful to everyone who contributed to the development and review of the tumor volume labels including volunteer annotators/approvers from the American Society of Neuroradiology.


Ethical Standards

The work follows appropriate ethical standards in conducting research and writing the manuscript, following all applicable laws and regulations regarding treatment of animals and human subjects. All participating sites had institutional review board (IRB) approval. A waiver for informed consent was provided by each institution’s respective IRB.


Conflicts of Interest

We declare that we have no conflicts of interest.


Data availability

The BraTS Meningioma Pre-operative Dataset training (1,000/1,424, 70%) and validation (141/1,424, 10%) data are publicly available on Synapse (Calabrese and LaBella, 2023; LaBella et al., 2024). The testing dataset (283/1,424, 20%) will be kept private for the foreseeable future to allow for the unbiased assessment of future segmentation algorithms.

References

  • bra (2023) Synapse: Brain tumor segmentation (brats) cluster of challenges. https://www.synapse.org/#!Synapse:syn51156910/wiki/, 2023.
  • Akiba et al. (2019) Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2019.
  • Baid et al. (2021) Ujjwal Baid, Satyam Ghodasara, Suyash Mohan, Michel Bilello, Evan Calabrese, Errol Colak, Keyvan Farahani, Jayashree Kalpathy-Cramer, Felipe C Kitamura, Sarthak Pati, et al. The rsna-asnr-miccai brats 2021 benchmark on brain tumor segmentation and radiogenomic classification. arXiv preprint arXiv:2107.02314, 2021.
  • Bakas et al. (2017) Spyridon Bakas, Hamed Akbari, Aristeidis Sotiras, Michel Bilello, Martin Rozycki, Justin S Kirby, John B Freymann, Keyvan Farahani, and Christos Davatzikos. Advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features. Scientific data, 4(1):1–13, 2017.
  • Bischoff-Grethe et al. (2007) Amanda Bischoff-Grethe, I Burak Ozyurt, Evelina Busa, Brian T Quinn, Christine Fennema-Notestine, Camellia P Clark, Shaunna Morris, Mark W Bondi, Terry L Jernigan, Anders M Dale, et al. A technique for the deidentification of structural brain mr images. Human brain mapping, 28(9):892–903, 2007.
  • Bouget et al. (2022) David Bouget, André Pedersen, Asgeir S Jakola, Vasileios Kavouridis, Kyrre E Emblem, Roelant S Eijgelaar, Ivar Kommers, Hilko Ardon, Frederik Barkhof, Lorenzo Bello, et al. Preoperative brain tumor imaging: Models and software for segmentation and standardized reporting. Frontiers in neurology, 13:932219, 2022.
  • Calabrese and LaBella (2023) E. Calabrese and D. LaBella. 2023 brats meningioma dataset. https://www.synapse.org/#!Synapse:syn51514106, 2023.
  • Capellan-Martin (2023) D. Capellan-Martin. Brats challenge 2023: Model ensemble for brain tumor segmentation in mri. In MICCAI, Vancouver, Canada, 2023.
  • Consortium (2020) The MONAI Consortium. Project monai. https://doi.org/10.5281/zenodo.4323059, 2020.
  • Galldiks et al. (2023) Norbert Galldiks, Nathalie L Albert, Michael Wollring, Jan-Michael Werner, Philipp Lohmann, Javier E Villanueva-Meyer, Gereon R Fink, Karl-Josef Langen, and Joerg-Christian Tonn. Advances in pet imaging for meningioma patients. Neuro-oncology advances, 5(Supplement_1):i84–i93, 2023.
  • Gilard et al. (2021) Vianney Gilard, Abdellah Tebani, Ivana Dabaj, Annie Laquerrière, Maxime Fontanilles, Stéphane Derrey, Stéphane Marret, and Soumeya Bekri. Diagnosis and management of glioblastoma: A comprehensive perspective. Journal of personalized medicine, 11(4):258, 2021.
  • Gordillo et al. (2013) Nelly Gordillo, Eduard Montseny, and Pilar Sobrevilla. State of the art survey on mri brain tumor segmentation. Magnetic resonance imaging, 31(8):1426–1438, 2013.
  • Havaei et al. (2017) Mohammad Havaei, Axel Davy, David Warde-Farley, Antoine Biard, Aaron Courville, Yoshua Bengio, Chris Pal, Pierre-Marc Jodoin, and Hugo Larochelle. Brain tumor segmentation with deep neural networks. Medical image analysis, 35:18–31, 2017.
  • Huang et al. (2023a) Z. Huang, H. Wang, Z. Deng, J. Ye, Y. Su, H. Sun, J. He, Y. Gu, L. Gu, and S. Zhang. Stu-net: Scalable and transferable medical image segmentation models empowered by large-scale supervised pre-training. arXiv preprint arXiv:2304.06716, 2023a.
  • Huang et al. (2023b) Z. Huang et al. Exploring model size and patch size for brats23 challenge. In MICCAI, Vancouver, Canada, 2023b.
  • Isensee et al. (2021) Fabian Isensee, Paul F Jaeger, Simon AA Kohl, Jens Petersen, and Klaus H Maier-Hein. nnu-net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods, 18(2):203–211, 2021.
  • Işın et al. (2016) Ali Işın, Cem Direkoğlu, and Melike Şah. Review of mri-based brain tumor image segmentation using deep learning methods. Procedia Computer Science, 102:317–324, 2016.
  • Juluru et al. (2020) Krishna Juluru, Eliot Siegel, and Jan Mazura. Identification from mri with face-recognition software. The New England Journal of Medicine, 382(5):489–490, 2020.
  • Karargyris et al. (2023) Alexandros Karargyris, Renato Umeton, Micah J Sheller, Alejandro Aristizabal, Johnu George, Anna Wuest, Sarthak Pati, Hasan Kassem, Maximilian Zenk, Ujjwal Baid, et al. Federated benchmarking of medical artificial intelligence with medperf. Nature Machine Intelligence, 5(7):799–810, 2023.
  • Kazerooni et al. (2023) Anahita Fathi Kazerooni, Nastaran Khalili, Xinyang Liu, Debanjan Haldar, Zhifan Jiang, Syed Muhammed Anwar, Jake Albrecht, Maruf Adewole, Udunna Anazodo, Hannah Anderson, et al. The brain tumor segmentation (brats) challenge 2023: Focus on pediatrics (cbtn-connect-dipgr-asnr-miccai brats-peds). ArXiv, 2023.
  • Khandwala et al. (2021) Kumail Khandwala, Fatima Mubarak, and Khurram Minhas. The many faces of glioblastoma: Pictorial review of atypical imaging features. The Neuroradiology Journal, 34(1):33–41, 2021.
  • LaBella (2024) Dominic LaBella. MeningiomaAnalysis. https://doi.org/10.5281/zenodo.13936365, 2024.
  • LaBella et al. (2023) Dominic LaBella, Maruf Adewole, Michelle Alonso-Basanta, Talissa Altes, Syed Muhammad Anwar, Ujjwal Baid, Timothy Bergquist, Radhika Bhalerao, Sully Chen, Verena Chung, et al. The asnr-miccai brain tumor segmentation (brats) challenge 2023: Intracranial meningioma. arXiv preprint arXiv:2305.07642, 2023.
  • LaBella et al. (2024) Dominic LaBella, Omaditya Khanna, Shan McBurney-Lin, Ryan Mclean, Pierre Nedelec, Arif S. Rashid, Nourel hoda Tahon, Talissa Altes, Ujjwal Baid, Radhika Bhalerao, Yaseen Dhemesh, Scott Floyd, Devon Godfrey, Fathi Hilal, Anastasia Janas, Anahita Kazerooni, Collin Kent, John Kirkpatrick, Florian Kofler, Kevin Leu, Nazanin Maleki, Bjoern Menze, Maxence Pajot, Zachary J. Reitman, Jeffrey D. Rudie, Rachit Saluja, Yury Velichko, Chunhao Wang, Pranav I. Warman, Nico Sollmann, David Diffley, Khanak K. Nandolia, Daniel I Warren, Ali Hussain, John Pascal Fehringer, Yulia Bronstein, Lisa Deptula, Evan G. Stein, Mahsa Taherzadeh, Eduardo Portela de Oliveira, Aoife Haughey, Marinos Kontzialis, Luca Saba, Benjamin Turner, Melanie M. T. Brüßeler, Shehbaz Ansari, Athanasios Gkampenis, David Maximilian Weiss, Aya Mansour, Islam H. Shawali, Nikolay Yordanov, Joel M. Stein, Roula Hourani, Mohammed Yahya Moshebah, Ahmed Magdy Abouelatta, Tanvir Rizvi, Klara Willms, Dann C. Martin, Abdullah Okar, Gennaro D’Anna, Ahmed Taha, Yasaman Sharifi, Shahriar Faghani, Dominic Kite, Marco Pinho, Muhammad Ammar Haider, Michelle Alonso-Basanta, Javier Villanueva-Meyer, Andreas M. Rauschecker, Ayman Nada, Mariam Aboian, Adam Flanders, Spyridon Bakas, and Evan Calabrese. A multi-institutional meningioma mri dataset for automated multi-sequence image segmentation. Scientific Data, 11:496, 2024. URL https://doi.org/10.1038/s41597-024-03350-9.
  • Martz et al. (2022) N Martz, J Salleron, F Dhermain, G Vogin, JF Daisne, R Mouttet Audouard, R Tanguy, G Noel, M Peyre, I Lecouillard, et al. Anocef consensus guideline on target volume delineation for meningiomas radiotherapy. International Journal of Radiation Oncology, Biology, Physics, 114(3):e46, 2022.
  • Menze et al. (2014) Bjoern H Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, et al. The multimodal brain tumor image segmentation benchmark (brats). IEEE transactions on medical imaging, 34(10):1993–2024, 2014.
  • MLCommons Association (2024) MLCommons Association. Mlcube: Standardizing ml deployment. https://mlcommons.org/working-groups/data/mlcube/, 2024. Accessed: 2024-05-08.
  • Moawad et al. (2023) Ahmed W Moawad, Anastasia Janas, Ujjwal Baid, Divya Ramakrishnan, Leon Jekel, Kiril Krantchev, Harrison Moy, Rachit Saluja, Klara Osenberg, Klara Wilms, et al. The brain tumor segmentation (brats-mets) challenge 2023: Brain metastasis segmentation on pre-treatment mri. ArXiv, 2023.
  • Murek (2024) Michael Murek. Localization of intracranial meningiomas, 2024. URL https://neurochirurgie.insel.ch/en/what-we-treat/brain-tumor/meningioma. Image: University Department of Neurosurgery, Inselspital Bern © CC BY-NC 4.0. Left: Axial view of meningiomas at the base of the skull. Right: Coronal view of meningiomas with location at the cranial dome, the falx cerebri as well as intraventricular.
  • Myronenko et al. (2023) A. Myronenko, D. Yang, Y. He, and D. Xu. Auto3dseg for brain tumor segmentation from 3d mri in brats 2023 challenge. In MICCAI, Vancouver, Canada, 2023.
  • Myronenko (2018) Andriy Myronenko. 3d mri brain tumor segmentation using autoencoder regularization. In International MICCAI Brainlesion Workshop, pages 311–320. Springer, Cham, 2018.
  • Ogasawara et al. (2021) Christian Ogasawara, Brandon D Philbrick, and D Cory Adamson. Meningioma: a review of epidemiology, pathology, diagnosis, treatment, and future directions. Biomedicines, 9(3):319, 2021.
  • Pati et al. (2022) Sarthak Pati, Ujjwal Baid, Brandon Edwards, Micah J Sheller, Patrick Foley, G Anthony Reina, Siddhesh Thakur, Chiharu Sako, Michel Bilello, Christos Davatzikos, et al. The federated tumor segmentation (fets) tool: an open-source solution to further solid tumor research. Physics in Medicine & Biology, 67(20):204002, 2022.
  • Pati et al. (2023) Sarthak Pati, Siddhesh P Thakur, İbrahim Ethem Hamamcı, Ujjwal Baid, Bhakti Baheti, Megh Bhalerao, Orhun Güley, Sofia Mouchtaris, David Lang, Spyridon Thermos, et al. Gandlf: the generally nuanced deep learning framework for scalable end-to-end clinical workflows. Communications Engineering, 2(1):23, 2023.
  • Pereira et al. (2016) Sérgio Pereira, Adriano Pinto, Victor Alves, and Carlos A Silva. Brain tumor segmentation using convolutional neural networks in mri images. IEEE transactions on medical imaging, 35(5):1240–1251, 2016.
  • Prasad et al. (2022) Rahul N Prasad, Haley K Perlow, Joseph Bovi, Steve E Braunstein, Jana Ivanidze, John A Kalapurakal, Christopher Kleefisch, Jonathan PS Knisely, Minesh P Mehta, Daniel M Prevedello, et al. 68ga-dotatate pet: the future of meningioma treatment. International Journal of Radiation Oncology, Biology, Physics, 113(4):868–871, 2022.
  • Rogers et al. (2020) C Leland Rogers, Minhee Won, Michael A Vogelbaum, Arie Perry, Lynn S Ashby, Jignesh M Modi, Anthony M Alleman, James Galvin, Shannon E Fogh, Emad Youssef, et al. High-risk meningioma: initial outcomes from nrg oncology/rtog 0539. International Journal of Radiation Oncology, Biology, Physics, 106(4):790–799, 2020.
  • Rogers et al. (2017) Leland Rogers, Peixin Zhang, Michael A Vogelbaum, Arie Perry, Lynn S Ashby, Jignesh M Modi, Anthony M Alleman, James Galvin, David Brachman, Joseph M Jenrette, et al. Intermediate-risk meningioma: initial outcomes from nrg oncology rtog 0539. Journal of neurosurgery, 129(1):35–47, 2017.
  • Rudie (2023) J. Rudie. Brats 2023 segmentation metrics: Clinical relevance. In Proceedings of the Medical Image Computing and Computer Assisted Intervention – MICCAI, Vancouver, Canada, 2023.
  • Salah et al. (2019) Florian Salah, A Tabbarah, K Asmar, H Tamim, M Makki, A Sibahi, R Hourani, et al. Can ct and mri features differentiate benign from malignant meningiomas? Clinical Radiology, 74(11):898–e15, 2019.
  • Saluja (2023) R. Saluja. Lesion-wise performance metrics for brats-2023 segmentation challenges. In Proceedings of the Medical Image Computing and Computer Assisted Intervention – MICCAI, Vancouver, Canada, 2023.
  • Schwarz et al. (2019) Christopher G Schwarz, Walter K Kremers, Terry M Therneau, Richard R Sharp, Jeffrey L Gunter, Prashanthi Vemuri, Arvin Arani, Anthony J Spychalla, Kejal Kantarci, David S Knopman, et al. Identification of anonymous mri research participants with face-recognition software. New England Journal of Medicine, 381(17):1684–1686, 2019.
  • Schwarz et al. (2021) Christopher G Schwarz, Walter K Kremers, Heather J Wiste, Jeffrey L Gunter, Prashanthi Vemuri, Anthony J Spychalla, Kejal Kantarci, Aaron P Schultz, Reisa A Sperling, David S Knopman, et al. Changing the face of neuroimaging research: comparing a new mri de-facing technique with popular alternatives. NeuroImage, 231:117845, 2021.
  • Smith (2002) Stephen M Smith. Fast robust automated brain extraction. Human brain mapping, 17(3):143–155, 2002.
  • Tang et al. (2022) Yucheng Tang, Dong Yang, Wenqi Li, Holger R. Roth, et al. Self-supervised pre-training of swin transformers for 3d medical image analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 20730–20740, 2022.
  • Thakur et al. (2020) Siddhesh Thakur, Jimit Doshi, Sarthak Pati, Saima Rathore, Chiharu Sako, Michel Bilello, Sung Min Ha, Gaurav Shukla, Adam Flanders, Aikaterini Kotrotsou, et al. Brain extraction on mri scans in presence of diffuse glioma: Multi-institutional performance evaluation of deep learning methods and robust modality-agnostic training. NeuroImage, 220:117081, 2020.
  • Thakur et al. (2019) Siddhesh P Thakur, Jimit Doshi, Sarthak Pati, Sung Min Ha, Chiharu Sako, Sanjay Talbar, Uday Kulkarni, Christos Davatzikos, Guray Erus, and Spyridon Bakas. Skull-stripping of glioblastoma mri scans using 3d deep learning. In International MICCAI Brainlesion Workshop, pages 57–68. Springer, 2019.
  • Theyers et al. (2021) Athena E Theyers, Mojdeh Zamyadi, Mark O’Reilly, Robert Bartha, Sean Symons, Glenda M MacQueen, Stefanie Hassel, Jason P Lerch, Evdokia Anagnostou, Raymond W Lam, et al. Multisite comparison of mri defacing software across multiple cohorts. Frontiers in psychiatry, 12:617997, 2021.
  • Wasserthal et al. (2022) J. Wasserthal, M. Meyer, H.C. Breit, J. Cyriac, S. Yang, and M. Segeroth. Totalsegmentator: robust segmentation of 104 anatomical structures in ct images. arXiv preprint arXiv:2208.05868, 2022.
  • Watts et al. (2014) J Watts, G Box, A Galvin, P Brotchie, N Trost, and T Sutherland. Magnetic resonance imaging of meningiomas: a pictorial review. Insights into imaging, 5(1):113–122, 2014.
  • Yushkevich et al. (2006) Paul A. Yushkevich, Joseph Piven, Heather Cody Hazlett, Rachel Gimpel Smith, Sean Ho, James C. Gee, and Guido Gerig. User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. Neuroimage, 31(3):1116–1128, 2006.