Quantitative Evaluation of Artificial Intelligence-Based Organ Segmentation Across Multiple Anatomic Sites Using 8 Commercial Software Platforms.
Further Information
Purpose: This study aims to evaluate variability in organ-at-risk (OAR) segmentation across 8 commercial artificial intelligence (AI)-based segmentation software platforms using independent multi-institutional data sets, and to provide recommendations for clinical practice using AI segmentation.
Methods and Materials: A total of 160 planning computed tomography image sets from 4 anatomic sites (head and neck, thorax, abdomen, and pelvis) were retrospectively pooled from 3 institutions. Contours for 31 OARs generated by the software were compared with clinical contours using multiple accuracy metrics, including the Dice similarity coefficient (DSC), the 95th percentile Hausdorff distance, and surface DSC, with relative added path length as an efficiency metric. A 2-factor analysis of variance was used to quantify variability in contouring accuracy across software platforms (intersoftware) and patients (interpatient). Pairwise comparisons were performed to categorize the software into performance groups, and intersoftware variation was calculated as the average performance difference between groups.
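As an illustration of two of the accuracy metrics named above, the following is a minimal sketch, not the study's implementation. It assumes contours are given as sets of voxel index tuples (for DSC) and as lists of surface points (for the 95th percentile Hausdorff distance). The function names `dice`, `percentile`, and `hd95` are hypothetical. Definitions of the 95th percentile Hausdorff distance vary between tools; this sketch takes the larger of the two directed 95th percentiles, which is an assumption. Surface DSC and added path length are not covered.

```python
import math


def dice(a, b):
    """Dice similarity coefficient between two sets of voxel indices."""
    a, b = set(a), set(b)
    if not a and not b:
        # Assumed convention: two empty contours agree perfectly.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence is undefined")
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def hd95(points_a, points_b):
    """95th percentile Hausdorff distance between two point sets.

    Assumption: the result is the larger of the two directed
    95th percentile nearest-neighbor distances.
    """
    def directed(src, dst):
        # Distance from each source point to its nearest destination point.
        return [min(math.dist(p, q) for q in dst) for p in src]

    return max(
        percentile(directed(points_a, points_b), 95),
        percentile(directed(points_b, points_a), 95),
    )


# Toy 2D example: a 4 x 4 reference contour and the same shape shifted by one voxel.
reference = {(x, y) for x in range(4) for y in range(4)}
shifted = {(x + 1, y) for x in range(4) for y in range(4)}

print(dice(reference, shifted))                   # 0.75
print(hd95(sorted(reference), sorted(shifted)))   # 1.0
```

The brute-force nearest-neighbor search is quadratic in the number of points, which is acceptable for a toy example but would need a spatial index for full-resolution clinical contours.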
Results: Significant intersoftware and interpatient variations in contouring accuracy (P < .05) were observed for most OARs. The largest intersoftware variations in DSC within each anatomic region were for the cervical esophagus (0.41), trachea (0.10), spinal cord (0.13), and prostate (0.17). Among the organs evaluated, 7 had a mean DSC >0.9 (eg, heart, liver) and 15 had a DSC from 0.7 to 0.89 (eg, parotid, esophagus). The remaining organs (eg, optic nerves, seminal vesicle) had a DSC <0.7. Of the 31 organs, 16 (52%) had a relative added path length below 0.1.
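The counts reported above can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the abstract (31 OARs, 7 and 15 organs in the two upper DSC bands, 16 organs with relative added path length below 0.1). The variable names and the `share` helper are hypothetical, and the size of the lowest band is derived here rather than quoted from the source.

```python
TOTAL_OARS = 31          # organs evaluated, as stated in the abstract
high_dsc = 7             # mean DSC > 0.9
mid_dsc = 15             # DSC from 0.7 to 0.89
apl_below_0_1 = 16       # relative added path length below 0.1

# Derived, not quoted: organs left over for the DSC < 0.7 band.
low_dsc = TOTAL_OARS - high_dsc - mid_dsc


def share(count, total):
    """Percentage of `total` represented by `count`, rounded to an integer."""
    return round(100 * count / total)


print(low_dsc)                             # 9
print(share(apl_below_0_1, TOTAL_OARS))    # 52
```

The derived percentage matches the 52% reported for relative added path length.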
Conclusions: Our results reveal significant intersoftware and interpatient variability in the performance of AI segmentation software. These findings highlight the need for thorough software commissioning, testing, and quality assurance across disease sites, patient-specific anatomies, and image acquisition protocols.
(Copyright © 2025 American Society for Radiation Oncology. All rights reserved.)