Prediction of chronic diseases based on ML packages using Spark MLlib

Aicha Oussous, Abderrahmane Ez-Zahout, Soumia Ziti

Abstract


Heart disease, diabetes, and breast cancer pose significant global health challenges, and addressing these chronic diseases effectively requires a coordinated international effort. Machine learning and predictive analytics offer promising tools for this task. Our study presents a unified model that uses the random forest (RF) algorithm and Spark MLlib to predict these three diseases. We test the model on three distinct datasets and evaluate its performance with standard classification metrics: the receiver operating characteristic (ROC) curve, accuracy, precision, recall, and F1-score. We also investigate whether variations in medical data and contextual factors affect the results. The findings indicate that, while the model shows strong overall performance, its effectiveness may differ for each disease because of data characteristics, disease-specific features, model behavior, and various biological and medical considerations. Understanding these factors is essential for improving model performance and ensuring its appropriate use in clinical environments.
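
As an illustration of the threshold-based metrics named in the abstract (accuracy, precision, recall, and F1-score), the following is a minimal sketch in plain Python of how they follow from the counts of a binary confusion matrix. It is not the authors' implementation and does not use Spark MLlib; the names `ConfusionCounts`, `count_outcomes`, and `binary_metrics`, as well as the toy labels, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # true positives: disease present, predicted present
    fp: int  # false positives: disease absent, predicted present
    fn: int  # false negatives: disease present, predicted absent
    tn: int  # true negatives: disease absent, predicted absent


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the confusion-matrix counts from paired labels and predictions."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def binary_metrics(c):
    """Return accuracy, precision, recall, and F1 for the given counts.

    Any ratio with a zero denominator is reported as 0.0, which is one
    common convention and an assumption of this sketch.
    """
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels invented for illustration only (1 = disease, 0 = no disease).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
counts = count_outcomes(y_true, y_pred)
metrics = binary_metrics(counts)
```

With these toy labels there are three true positives, one false positive, one false negative, and three true negatives, so all four metrics equal 0.75. The ROC curve is not covered here because it requires predicted scores rather than hard labels.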

Keywords


Apache Spark; Breast cancer disease; Chronic diseases; Diabetes disease; Heart disease; Random forest algorithm; Spark MLlib

DOI: http://doi.org/10.11591/ijeecs.v37.i2.pp1121-1129

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
