Wahid, Kareem A.; Kaffey, Zaphanlene Y.; Farris, David P.; Humbert-Vidan, Laia; Moreno, Amy C.; Rasmussen, Mathis; Ren, Jintao; Naser, Mohamed A.; Netherton, Tucker J.; Korreman, Stine; Balakrishnan, Guha; Fuller, Clifton D.; Fuentes, David; Dohopolski, Michael J.
2024-10-08
2024-10-08
2024
Wahid, K. A., Kaffey, Z. Y., Farris, D. P., Humbert-Vidan, L., Moreno, A. C., Rasmussen, M., Ren, J., Naser, M. A., Netherton, T. J., Korreman, S., Balakrishnan, G., Fuller, C. D., Fuentes, D., & Dohopolski, M. J. (2024). Artificial intelligence uncertainty quantification in radiotherapy applications − A scoping review. Radiotherapy and Oncology, 201, 110542. https://doi.org/10.1016/j.radonc.2024.110542
https://hdl.handle.net/1911/117932
Background/purpose: The use of artificial intelligence (AI) in radiotherapy (RT) is expanding rapidly. However, there exists a notable lack of clinician trust in AI models, underscoring the need for effective uncertainty quantification (UQ) methods. The purpose of this study was to scope existing literature related to UQ in RT, identify areas of improvement, and determine future directions.
Methods: We followed the PRISMA-ScR scoping review reporting guidelines. We utilized the population (human cancer patients), concept (utilization of AI UQ), context (radiotherapy applications) framework to structure our search and screening process. We conducted a systematic search spanning seven databases, supplemented by manual curation, up to January 2024. Our search yielded a total of 8980 articles for initial review. Manuscript screening and data extraction were performed in Covidence. Data extraction categories included general study characteristics, RT characteristics, AI characteristics, and UQ characteristics.
Results: We identified 56 articles published from 2015 to 2024. 10 domains of RT applications were represented; most studies evaluated auto-contouring (50 %), followed by image-synthesis (13 %), and multiple applications simultaneously (11 %).
12 disease sites were represented, with head and neck cancer being the most common disease site independent of application space (32 %). Imaging data was used in 91 % of studies, while only 13 % incorporated RT dose information. Most studies focused on failure detection as the main application of UQ (60 %), with Monte Carlo dropout being the most commonly implemented UQ method (32 %) followed by ensembling (16 %). 55 % of studies did not share code or datasets.
Conclusion: Our review revealed a lack of diversity in UQ for RT applications beyond auto-contouring. Moreover, we identified a clear need to study additional UQ methods, such as conformal prediction. Our results may incentivize the development of guidelines for reporting and implementation of UQ in RT.
eng
Except where otherwise noted, this work is licensed under a Creative Commons Attribution (CC BY) license. Permission to reuse, publish, or reproduce the work beyond the terms of the license or beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
Artificial intelligence uncertainty quantification in radiotherapy applications − A scoping review
Journal article
1-s2-0-S0167814024035205-main
https://doi.org/10.1016/j.radonc.2024.110542