Optimizing random forest classifier with Jenesis-index on an imbalanced dataset

Joylin Zeffora, Shobarani Shobarani

Abstract


Random forest is an ensemble machine learning algorithm. In its decision trees, the splitting criterion at each node, and hence the rules that are formed, is based on the Gini index or information gain. The Gini index is a measure of inequality. It does not take structural changes in the dataset into consideration, and inaccurate data can distort the validity of the Gini coefficient. For data with the same feature but different outcomes, the Gini coefficient remains the same. The proposed attribute selection measure takes into account that the dataset may undergo structural changes over time; it adapts to such expected changes and maintains the accuracy of the algorithm, avoiding under-fitting and over-fitting. A dataset on myocardial infarctions was used for the study, and the results were promising.
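For readers unfamiliar with the two baseline criteria named above, the following is a minimal sketch of the standard Gini impurity and entropy-based information gain as used for decision-tree splits. It does not implement the proposed Jenesis-index, which this abstract does not define. The function names (`gini_impurity`, `entropy`, `split_gain`) and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import log2


def gini_impurity(labels):
    """Standard Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Impurity reduction from splitting `parent` into `left` and `right`.

    With impurity=entropy this is information gain; with
    impurity=gini_impurity it is the Gini gain.
    """
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Hypothetical toy labels (not from the paper's dataset).
mostly_negative = [0, 0, 0, 1]
mostly_positive = [1, 1, 1, 0]

# Gini impurity depends only on the class proportions, so these two nodes
# score identically even though their majority outcomes differ.
print(gini_impurity(mostly_negative), gini_impurity(mostly_positive))

# A split that separates the classes perfectly removes all impurity.
parent = [0, 0, 1, 1]
print(split_gain(parent, [0, 0], [1, 1], entropy))
print(split_gain(parent, [0, 0], [1, 1], gini_impurity))
```

The first comparison shows one possible reading of the remark that the Gini coefficient stays the same when outcomes differ: the measure is symmetric in the class labels, so swapping which class is the majority does not change the score.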

Keywords


Gini coefficient; Gini index; Myocardial infarctions; Random forest



DOI: http://doi.org/10.11591/ijeecs.v26.i1.pp505-511



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
