Result: Fine tuned CatBoost machine learning approach for early detection of cardiovascular disease through predictive modeling.
Further information
Cardiovascular disease (CVD) remains one of the leading causes of morbidity and mortality worldwide, highlighting the urgent need for early-stage diagnosis to improve clinical outcomes. Machine learning (ML) approaches have demonstrated substantial potential in predictive modeling for CVD risk assessment. In this study, we propose an advanced predictive model based on the CatBoost algorithm to classify various stages of CVD using hospital records as the primary data source. The dataset, sourced from a publicly available repository, comprises 12 key predictor variables. The proposed methodology incorporates feature selection, rigorous validation processes, and data augmentation to enhance predictive performance and address the challenges associated with high-dimensional medical data. Among several ML algorithms evaluated, the fine-tuned CatBoost model achieved the highest performance, automating feature selection and facilitating the detection of early-stage heart disease. The model attained an impressive F1-score of 99% and an overall accuracy of 99.02%, outperforming existing ML-based approaches. These findings underscore the potential of the CatBoost algorithm for rapid and accurate CVD diagnosis, thereby supporting clinical decision-making. Future work will focus on external validation and testing on independent datasets to further assess the model's generalizability and clinical applicability.
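To make the two reported metrics concrete, here is a minimal sketch of how accuracy and F1 can be computed for a multi-class problem such as CVD stage classification. It is not the authors' evaluation code. The abstract does not say whether the reported F1-score is macro-averaged, weighted, or computed some other way, so macro averaging is an assumption here. The function names and the toy labels are hypothetical and are not taken from the study's dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred):
    """Per-class F1 = 2*TP / (2*TP + FP + FN), computed one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # true class t was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy labels for three hypothetical disease stages (0, 1, 2).
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 0]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

On imbalanced clinical data the two metrics can diverge: accuracy is dominated by the majority class, while macro F1 gives each class equal weight. This is one reason to report both, as the abstract does.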
(© 2025. The Author(s).)
Declarations. Competing interests: The authors declare no competing interests.