Machine learning models in the enhancement of PSE in high-dimensional socioeconomic data: a review
Abstract
This study reviews the use of machine learning (ML) techniques to improve propensity score estimation (PSE) in high-dimensional socioeconomic data. Traditional logistic regression (LR) is often misspecified when covariate structures are nonlinear or complex, which biases the estimated propensity scores. Across the reviewed studies, ensemble methods, such as random forests (RF) and gradient boosting, and deep learning models consistently achieved better covariate balance, lower bias, and greater flexibility than conventional approaches, while classification-based methods improved performance in imbalanced datasets. The review also highlights practical considerations, including calibration, transparent reporting, and integration with doubly robust estimators to strengthen causal inference. The findings show that ML-based PSE can substantially enhance the validity and reliability of socioeconomic evaluations, provided that its implementation is guided by appropriate expertise and best-practice standards.
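Covariate balance, the criterion the abstract uses to compare estimators, can be illustrated with a minimal sketch. It assumes propensity scores have already been produced by some model (LR or ML) and checks balance through the standardized mean difference (SMD) before and after inverse-probability weighting. The function names, the toy data, and the choice of an unweighted pooled standard deviation as the denominator are assumptions made for illustration; they are not taken from the reviewed studies.

```python
from math import sqrt


def ipw_weights(treated, ps):
    """Inverse-probability weights: 1/e for treated units, 1/(1-e) for controls."""
    return [1.0 / e if t == 1 else 1.0 / (1.0 - e) for t, e in zip(treated, ps)]


def _mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def _variance(values):
    """Unweighted population variance (divides by n)."""
    ones = [1.0] * len(values)
    m = _mean(values, ones)
    return sum((v - m) ** 2 for v in values) / len(values)


def smd(x, treated, weights=None):
    """Standardized mean difference of covariate x between treated and controls.

    The numerator uses (optionally weighted) group means. The denominator is
    the unweighted pooled standard deviation, so values computed before and
    after weighting are on the same scale (an assumed convention).
    """
    if weights is None:
        weights = [1.0] * len(x)
    xt = [v for v, t in zip(x, treated) if t == 1]
    wt = [w for w, t in zip(weights, treated) if t == 1]
    xc = [v for v, t in zip(x, treated) if t == 0]
    wc = [w for w, t in zip(weights, treated) if t == 0]
    pooled_sd = sqrt((_variance(xt) + _variance(xc)) / 2.0)
    return (_mean(xt, wt) - _mean(xc, wc)) / pooled_sd


# Toy data (hypothetical): one binary covariate that drives treatment uptake.
x = [0, 0, 0, 0, 1, 1, 1, 1]
treated = [1, 0, 0, 0, 1, 1, 1, 0]
ps = [0.25] * 4 + [0.75] * 4  # assumed output of a propensity model

print("SMD before weighting:", smd(x, treated))
print("SMD after weighting: ", smd(x, treated, ipw_weights(treated, ps)))
```

In this toy case the scores match the treatment rates within each covariate level, so weighting removes the imbalance. With scores from a misspecified model, the weighted SMD would generally stay away from zero, which is how balance diagnostics can expose a poor propensity model.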
Keywords
Ensemble model; High-dimensional data; Machine learning models; Socioeconomic evaluation; Systematic review
DOI: http://doi.org/10.11591/ijeecs.v41.i2.pp645-654

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).