Machine learning models in the enhancement of PSE in high-dimensional socioeconomic data: a review

Gene Marck B. Catedrilla, Joey Aviles

Abstract


This study reviews the use of machine learning (ML) techniques to improve propensity score estimation (PSE) in high-dimensional socioeconomic data. Traditional logistic regression (LR) often performs poorly when covariate relationships are nonlinear or complex, leading to model misspecification and biased estimates. Across the reviewed studies, ensemble methods such as random forests (RF) and gradient boosting, as well as deep learning models, consistently achieved better covariate balance, lower bias, and greater flexibility than conventional approaches, while classification-based methods improved performance on imbalanced datasets. The review also highlights practical considerations, including calibration, transparent reporting, and integration with doubly robust estimators to strengthen causal inference. The findings show that ML-based PSE can substantially enhance the validity and reliability of socioeconomic evaluations, provided that its implementation is guided by appropriate expertise and best-practice standards.

Keywords


Ensemble model; High-dimensional data; Machine learning models; Socioeconomic evaluation; Systematic review


DOI: http://doi.org/10.11591/ijeecs.v41.i2.pp645-654


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
