A recommendation system of training data selection method for cross-project defect prediction

Benyamin Langgu Sinaga, Sabrina Ahmad, Zuraida Abal Abas, Intan Ermahani A. Jalil

Abstract


Cross-project defect prediction (CPDP) is a popular approach to building a defect prediction model when a project's own historical data is limited. Training a model directly on all available cross-project data, however, yields unsatisfactory predictive performance, so selecting suitable training data is essential. Many studies have examined the effectiveness of training data selection methods, and the best-performing method varies across datasets. Because no method consistently outperforms the others on all datasets, it becomes important to predict the best method for a specific dataset. This study proposes a recommendation system that selects the most suitable training data selection method in the CPDP setting. We evaluated the proposed system using 44 datasets, 13 training data selection methods, and six classification algorithms. The findings indicate that the recommendation system effectively recommends the best method for selecting training data.
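
The idea of recommending a selection method per dataset can be illustrated with a minimal meta-learning sketch. It is not the system proposed in the paper: the nearest-neighbour voting scheme, the function name recommend_method, the two-value meta-feature vectors, and the labels method_A and method_B are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def recommend_method(meta_features, history, k=3):
    """Recommend a training data selection method for a new dataset.

    meta_features: numeric description of the new dataset (hypothetical).
    history: list of (meta_feature_vector, best_method_name) pairs taken
             from datasets where the best method is already known.
    """
    # Rank past datasets by Euclidean distance to the new dataset.
    ranked = sorted(history, key=lambda item: math.dist(item[0], meta_features))
    nearest = ranked[:k]
    # Majority vote among the k nearest datasets.
    votes = Counter(method for _, method in nearest)
    top = max(votes.values())
    # On a tie, prefer the method of the closest dataset.
    for _, method in nearest:
        if votes[method] == top:
            return method


# Made-up history: meta-features and the method that worked best there.
history = [
    ([0.10, 0.20], "method_A"),
    ([0.15, 0.25], "method_A"),
    ([0.90, 0.80], "method_B"),
    ([0.85, 0.90], "method_B"),
    ([0.50, 0.50], "method_B"),
]

print(recommend_method([0.12, 0.22], history))  # method_A
```

In this sketch the recommendation is simply the method that performed best on the most similar past datasets; a real recommender would need carefully chosen meta-features and an evaluation such as the one described in the abstract.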

Keywords


Cross-project defect prediction; Meta-learning; Recommendation system; Training data selection


DOI: http://doi.org/10.11591/ijeecs.v27.i2.pp990-1006


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
