Heart disease prediction using ML through enhanced feature engineering with association and correlation analysis
Abstract
Heart disease remains a prevalent and serious health concern worldwide. This paper addresses heart disease prediction using machine learning (ML), with a focus on improving feature engineering through a novel integration of association and correlation analyses. A heart disease dataset from Kaggle was used for the experiments. Association analysis was applied to the categorical and binary features, and correlation analysis to the numerical features. Based on the insights from both analyses, new features were created from combinations of existing features and then integrated with the original dataset. Five ML classifiers, namely decision tree, k-nearest neighbors (KNN), random forest, XGBoost, and support vector machine (SVM), were applied to the final dataset and achieved good accuracy for heart disease detection. By systematically exploring associations and relationships among categorical, binary, and numerical features, this paper offers insights that contribute to a more comprehensive understanding of the heart disease dataset.
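The feature-engineering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the association or correlation measures, the rule for combining features, or any thresholds, so everything specific here is an assumption: Cramér's V stands in for association analysis, Pearson's r for correlation analysis, categorical pairs are combined by concatenation and numerical pairs by product, and the column names (`sex`, `exang`, `age`, `max_hr`), the cutoffs `v_min` and `r_min`, and the toy rows are hypothetical. The classification step is not shown.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two categorical columns (0 = independent, 1 = perfect)."""
    n = len(xs)
    joint, row, col = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numerical columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def engineer_features(rows, categorical, numerical, v_min=0.5, r_min=0.5):
    """Append combined features for strongly related pairs (assumed rule)."""
    cols = {c: [r[c] for r in rows] for c in categorical + numerical}
    cat_pairs = [(a, b) for a, b in combinations(categorical, 2)
                 if cramers_v(cols[a], cols[b]) >= v_min]
    num_pairs = [(a, b) for a, b in combinations(numerical, 2)
                 if abs(pearson_r(cols[a], cols[b])) >= r_min]
    augmented = []
    for r in rows:
        new = dict(r)  # keep the original features alongside the new ones
        for a, b in cat_pairs:
            new[f"{a}_{b}"] = f"{r[a]}|{r[b]}"   # concatenate categories
        for a, b in num_pairs:
            new[f"{a}_x_{b}"] = r[a] * r[b]      # product of numeric values
        augmented.append(new)
    return augmented, cat_pairs, num_pairs


# Toy rows with hypothetical column names; not taken from the Kaggle dataset.
rows = [
    {"sex": 1, "exang": 1, "age": 40, "max_hr": 180},
    {"sex": 1, "exang": 1, "age": 50, "max_hr": 170},
    {"sex": 0, "exang": 0, "age": 60, "max_hr": 160},
    {"sex": 0, "exang": 0, "age": 70, "max_hr": 150},
]
augmented, cat_pairs, num_pairs = engineer_features(
    rows, ["sex", "exang"], ["age", "max_hr"]
)
```

In a fuller pipeline, the augmented rows would then be encoded and passed to the classifiers listed in the abstract; that step is omitted here because the abstract gives no configuration details.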
Keywords
Association analysis; Correlation analysis; Feature engineering; Heart disease; Kaggle; Machine learning
DOI: http://doi.org/10.11591/ijeecs.v34.i2.pp1122-1130

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
