Prediction of Cardiovascular Disease Based on Voting Ensemble Model and SHAP Analysis
Year 2023, 226 - 238, 31.12.2023
Erkan Akkur
Abstract
Cardiovascular diseases (CVD), or heart diseases, lead the list of fatal diseases. However, treating these diseases is a time-consuming process. Therefore, new approaches are being developed for their detection. Machine learning methods are among these new approaches. In particular, these algorithms contribute significantly to solving prediction problems in various fields. Given the amount of clinical data currently available in the medical field, it is useful to apply these algorithms to tasks such as CVD prediction. This study proposes a prediction model based on voting ensemble learning for the prediction of CVD. Furthermore, the SHAP technique is used to interpret the proposed prediction model, including the risk factors contributing to the detection of this disease. The proposed model achieved an accuracy of 0.9534 and an AUC-ROC score of 0.954 for CVD prediction. Compared to similar studies in the literature, the proposed prediction model provides a good classification rate.
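The three ingredients named in the abstract (a voting ensemble, accuracy and AUC-ROC scoring, and SHAP-style attribution) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say whether voting is hard or soft (soft voting is used here), the base learners and their weights are hypothetical, the toy data is made up, and the exact Shapley enumeration with a single baseline vector is a simplified stand-in for the SHAP technique rather than the study's actual pipeline.

```python
from itertools import combinations
from math import exp, factorial


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


# Hypothetical base learners: each maps a feature vector to P(CVD).
# The weights are made up for illustration, not fitted to any data.
BASE_MODELS = [
    lambda x: sigmoid(1.2 * x[0] + 0.4 * x[1] - 0.8),
    lambda x: sigmoid(0.6 * x[0] + 1.0 * x[2] - 0.5),
    lambda x: sigmoid(0.9 * x[1] + 0.7 * x[2] - 0.9),
]


def soft_vote(x, models=BASE_MODELS):
    """Soft voting: average the base learners' predicted probabilities."""
    return sum(m(x) for m in models) / len(models)


def accuracy(y_true, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC-ROC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


# Made-up toy samples (three features each) with binary labels.
X = [[1.5, 0.2, 1.0], [0.1, 0.0, 0.2], [1.0, 1.4, 0.3], [0.0, 0.3, 0.1]]
y = [1, 0, 1, 0]
scores = [soft_vote(x) for x in X]
print("accuracy:", accuracy(y, scores), "AUC-ROC:", roc_auc(y, scores))
print("attributions:", shapley_values(soft_vote, X[0], baseline=[0.0, 0.0, 0.0]))
```

The attributions for a sample sum to the difference between the ensemble's output for that sample and its output for the baseline, which is the property that lets each risk factor's contribution be read off per prediction. Exact enumeration grows exponentially with the number of features, so practical SHAP implementations rely on approximations.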
Ethical Statement
The open-access HEART DISEASE DATASET (COMPREHENSIVE) was used. The data can be accessed at https://ieee-dataport.org/open-access/heart-disease-dataset-comprehensive. Therefore, ethics committee approval is not required.
Supporting Institution
No support was received from any institution.
Thanks
We thank the person(s) who made the ‘HEART DISEASE DATASET (COMPREHENSIVE)’ dataset used in this study available on the open-access website (https://ieee-dataport.org/open-access/heart-disease-dataset-comprehensive).
References
- [1] F. Coronado, S. C. Melvin, R. A. Bell and G. Zhao, “Global Responses to Prevent, Manage, and Control Cardiovascular Diseases.” Prev Chronic Dis, 2022, 8:19:E84.
- [2] R. Hajar, “Risk Factors for Coronary Artery Disease: Historical Perspectives.” Heart Views, 2017, 18(3), 109-114.
- [3] J. Azmi, M. Arif, M.T. Nafis, M. A. Alam, S. Tanweer, G. Wang, “A systematic review on machine learning approaches for cardiovascular disease prediction using medical big data.” Medical Engineering & Physics, 2022, 105, 103825.
- [4] K. P. Kresoja, M. Unterhuber, R. Wachter, H. Thiele, P. Lurz, “A cardiologist’s guide to machine learning in cardiovascular disease prognosis prediction.” Basic research in cardiology, 2023, 118(1), 10.
- [5] S. Mohapatra, S. Maneesha, S. Mohanty, P. K. Patra, S.K. Bhoi, K. S. Sahoo and A.H. Gandomi. “A stacking classifiers model for detecting heart irregularities and predicting cardiovascular disease.” Healthcare Analytics, 2023, 3, 100133.
- [6] I.D. Mienye and Y. Sun, “A survey of ensemble learning: Concepts, algorithms, applications, and prospects.” IEEE Access, 2022, 10, 99129-99149
- [7] K. Wang et al. “Interpretable prediction of 3-year all-cause mortality in patients with heart failure caused by coronary heart disease based on machine learning and SHAP.” Computers in Biology and Medicine, 2021, 137, 104813.
- [8] M. Ahsan and Z. Siddique, “Machine learning-based heart disease diagnosis: A systematic literature review.” Artificial Intelligence in Medicine, 2022, 128, 102289.
- [9] Sangya W., Shanu KR, C. Bharat, “Heart attack prediction by using machine learning techniques.” International Journal of Recent Technology and Engineering, 2020, 8(5), 1577–1580.
- [10] D. Shah, S. Patel, S.K. Bharti. “Heart disease prediction using machine learning techniques.” SN Computer Science, 2020, 1:345.
- [11] A. Rajdhan, A. Agarwal, M. Sai, D. Ravi, P. Ghuli, “Heart disease prediction using machine learning.” International Journal of Research and Technology, 2020, 9(04), 659–662.
- [12] S. Poorani, D. Hemalatha, “Machine learning techniques for heart disease prediction.” Journal of Cardiovascular Disease Research, 2021, 12(1), 93–96.
- [13] O. Ozhan and Z. Kuçukakcali, “Estimation of risk factors related to heart attack with XGBoost that machine learning model.” Middle Black Sea Journal of Health Science, 2022, 8(4), 582-591.
- [14] T. Das and B. B. Sinha, “A comprehensive study on machine learning methods for predicting heart disease: a comparative analysis.” 8th International Conference on Computing in Engineering and Technology (ICCET 2023), Hybrid Conference, Patna, India, 2023, pp. 205-210.
- [15] K. Akyol and U. Atilla, “A study on performance improvement of heart disease prediction by attribute selection methods.” Academic Platform Journal of Engineering and Science, 2019, 7(2), 174-179.
- [16] M. Jan, A.A. Awan, M.S. Khalid and S. Nisar, “Ensemble approach for developing a smart heart disease prediction system using classification algorithms.” Research Reports in Clinical Cardiology, 2018, 9, 33-45.
- [17] A. Tiwari, A. Chugh, A. Sharma, “Ensemble framework for cardiovascular disease prediction.” Computers in Biology and Medicine, 2022, 146, 105624.
- [18] R. Yilmaz and F.H. Yagin, “Early detection of coronary heart disease based on machine learning methods.” Medical Records, 2022, 4(1), 1-6.
- [19] B.P. Doppala, D. Bhattacharyya, M. Janarthanan, N. Baik, “A reliable machine intelligence model for accurate identification of cardiovascular diseases using ensemble techniques.” J Healthc Eng. 2022, 8:2022:2585235.
- [20] M.T. García-Ordás, M. Bayón-Gutiérrez, C. Benavides et al. “Heart disease risk prediction using deep learning techniques with feature augmentation.” Multimed Tools Appl, 2023, 82, 31759–31773.
- [21] M. Siddhartha, “Heart Disease Dataset (Comprehensive).” IEEE Dataport, November 5, 2020, doi: https://dx.doi.org/10.21227/dz4t-cm36. (Accessed: 10.09.2023).
- [22] S. Garcia, S. Ramírez-Gallego, J. Luengo, J.M. Benítez and F. Herrera, “Big data preprocessing: methods and prospects.” Big Data Analytics, 2016, 1(1), 1-22.
- [23] S.G.K. Patro and K.K. Sahu, “Normalization: A preprocessing stage.” arXiv preprint arXiv:1503.06462, 2015.
- [24] B.C. Haarman, R.F. Riemersma-Van der Lek, W.A. Nolen, R. Mendes, H.A. Drexhage, H. Burger. “Feature-expression heat maps--a new visual method to explore complex associations between two variable sets.” J Biomed Inform. 2015, 53:156-61.
- [25] S. Tewari, U.D. Dwivedi. “A comparative study of heterogeneous ensemble methods for the identification of geological lithofacies.” J Petrol Explor Prod Technol. 2020, 10, 1849–1868.
- [26] N. Chandrasekhar, S. Peddakrishna, “Enhancing heart disease prediction accuracy through machine learning techniques and optimization.” Processes 2023, 11, 1210.
- [27] Y. Xie, C. Zhu, W. Zhou, Z. Li, X. Liu, T. Tu. “Evaluation of machine learning methods for formation lithology identification: a comparison of tuning process and model performance.” J Pet Sci Eng 2018, 60:182–193.
- [28] D.M. Belete, M. D. Huchaiah. “Grid search in hyperparameter optimization of machine learning models for prediction of HIV/AIDS test results.” International Journal of Computers and Applications, 2022, 44:9, 875-886.
- [29] S.M. Lundberg and S.I. Lee, “A unified approach to interpreting model predictions.” Advances in neural information processing systems, 2017, 30.