Comparative analysis of machine learning models for breast cancer prediction and diagnosis: a dual-dataset approach

Muhammad Zeerak Awan, Muhammad Shoaib Arif, Mirza Zain Ul Abideen, Kamaleldin Abodayeh


Breast cancer is a significant cause of mortality among women worldwide. Its complex nature poses major challenges for physicians and researchers seeking rapid diagnosis and prognosis, which has motivated the use of machine learning algorithms to predict and identify the disease. This study presents a comparative analysis of seven machine learning models, namely logistic regression (LR), support vector machine (SVM), k-nearest neighbor classifier (KNN), decision tree classifier (DT), random forest classifier (RF), Naïve Bayes (NB), and artificial neural network (ANN), for predicting breast cancer using two datasets: the Wisconsin breast cancer (WBC) dataset and the breast cancer (BC) dataset. On the WBC dataset, KNN achieved the highest accuracy at 99%, followed by RF (98%), SVM (96%), NB (96%), LR (96%), ANN (93%), and DT (92%). In contrast, on the BC dataset, the highest accuracy was achieved by LR (83%) and the lowest by DT (65%), indicating that the models were more accurate on the numeric WBC dataset than on the BC dataset.


Breast cancer; Data mining; Datasets; Machine learning; Model evaluation
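
The kind of comparison described in the abstract, in which several classifiers are trained on the same split and ranked by held-out accuracy, can be illustrated with a minimal sketch. This is not the authors' code or pipeline. It assumes synthetic two-feature data in place of the WBC and BC datasets, covers only two of the seven models (KNN and NB), and uses hypothetical helper names (`make_synthetic`, `knn_fit`, `gaussian_nb_fit`, `compare_models`) and arbitrary settings (k = 5, a 70/30 split). Its scores are therefore not comparable to the accuracies reported above.

```python
import math
import random
from collections import Counter


def make_synthetic(n_per_class, seed=0):
    """Two Gaussian blobs standing in for two diagnostic classes (assumption: toy data)."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)]
            data.append((features, label))
    rng.shuffle(data)
    return data


def knn_fit(train, k=5):
    """Return a predict function for a k-nearest-neighbour classifier."""
    def predict(x):
        # Majority vote among the k training points closest to x.
        nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def gaussian_nb_fit(train):
    """Return a predict function for Gaussian Naive Bayes."""
    stats = {}
    for label in {lab for _, lab in train}:
        rows = [x for x, lab in train if lab == label]
        columns = list(zip(*rows))
        means = [sum(col) / len(rows) for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(columns, means)]
        stats[label] = (math.log(len(rows) / len(train)), means, variances)

    def predict(x):
        def log_posterior(label):
            log_prior, means, variances = stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(x, means, variances))
        return max(stats, key=log_posterior)
    return predict


def accuracy(predict, test):
    """Fraction of held-out samples classified correctly."""
    return sum(predict(x) == y for x, y in test) / len(test)


def compare_models(models, train, test):
    """Fit every model on the same split and rank by held-out accuracy, best first."""
    scores = {name: accuracy(fit(train), test) for name, fit in models.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


data = make_synthetic(n_per_class=100, seed=0)
split = int(0.7 * len(data))  # 70% train, 30% test (assumed ratio)
train, test = data[:split], data[split:]
ranking = compare_models({"KNN": knn_fit, "NB": gaussian_nb_fit}, train, test)
for name, score in ranking:
    print(f"{name}: {score:.2%}")
```

Each entry passed to `compare_models` maps a model name to a function that takes the training set and returns a predictor, so further classifiers can be added without changing the evaluation loop.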


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
