| Abstract [eng] |
This systematic review examines and compares deep learning methods for brain tumor segmentation and classification using MRI and other imaging modalities, focusing on trends from 2022 to 2025. The primary objective is to evaluate methodological advancements, model performance, dataset usage, and remaining challenges in developing clinically robust AI systems. We included peer-reviewed journal articles and high-impact conference papers published between 2022 and 2025, written in English, that proposed or evaluated deep learning methods for brain tumor segmentation and/or classification. We excluded non-open-access publications, books, and non-English articles. A structured search was conducted across Scopus, Google Scholar, Wiley, and Taylor & Francis, with the last search performed in August 2025. Risk of bias was not formally quantified but was considered during full-text screening on the basis of dataset diversity, validation methods, and availability of performance metrics. We used narrative synthesis and tabular benchmarking to compare performance metrics (e.g., accuracy, Dice score) across model types (CNN, Transformer, Hybrid), imaging modalities, and datasets. A total of 49 studies were included (43 journal articles and 6 conference papers). These studies spanned over 9 public datasets (e.g., BraTS, Figshare, REMBRANDT, MOLAB) and used a range of imaging modalities, predominantly MRI. Hybrid models, especially ResViT and UNetFormer, consistently achieved high performance, with classification accuracy exceeding 98% and segmentation Dice scores above 0.90 across multiple studies. Adoption of Transformers and hybrid architectures increased after 2023. Many studies lacked external validation and were evaluated on only a few benchmark datasets, raising concerns about generalizability and dataset bias. Few studies addressed clinical interpretability or uncertainty quantification.
Despite promising results, particularly for hybrid deep learning models, widespread clinical adoption remains limited by the lack of external validation, interpretability concerns, and real-world deployment barriers. |