A recommendation system of training data selection method for cross-project defect prediction

Benyamin Langgu Sinaga, Sabrina Ahmad, Zuraida Abal Abas, Intan Ermahani A. Jalil


Cross-project defect prediction (CPDP) is a popular way to address the shortage of historical data when building a defect prediction model. Training a model directly on cross-project datasets, however, yields unsatisfactory predictive performance, so selecting suitable training data is essential. Many studies have examined the effectiveness of training data selection methods, and the best-performing method varies across datasets. Because no method consistently outperforms the others on all datasets, it is important to predict which method is best for a specific dataset. This study proposes a recommendation system that selects the most suitable training data selection method in the CPDP setting. We evaluated the proposed system using 44 datasets, 13 training data selection methods, and six classification algorithms. The results show that the recommendation system effectively recommends the best training data selection method.


Cross-project defect prediction; Meta-learning; Recommendation system; Training data selection
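
The abstract does not say how the recommendation is computed, so the following is only a minimal sketch of the general meta-learning idea, not the authors' system. It assumes each dataset is summarised by a few numeric meta-features and that the best training data selection method is already known for some past datasets; a simple nearest-neighbour vote then suggests a method for a new dataset. The function name, the two meta-features, the method labels, and all numbers are hypothetical toy values.

```python
from collections import Counter
from math import sqrt


def recommend_method(meta_features, best_methods, query, k=3):
    """Recommend a method by majority vote among the k most similar past datasets."""
    n = len(meta_features)
    dims = len(query)
    # Standardise each meta-feature so that no single feature dominates the distance.
    means = [sum(row[j] for row in meta_features) / n for j in range(dims)]
    stds = [
        sqrt(sum((row[j] - means[j]) ** 2 for row in meta_features) / n) or 1.0
        for j in range(dims)
    ]

    def scale(row):
        return [(row[j] - means[j]) / stds[j] for j in range(dims)]

    q = scale(query)
    # Euclidean distance from the query dataset to every past dataset.
    distances = sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(scale(row), q))), label)
        for row, label in zip(meta_features, best_methods)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy meta-data (hypothetical): [number of instances, defect ratio] per past dataset,
# paired with the selection method assumed to have worked best on it.
past_meta_features = [
    [100, 0.10], [120, 0.12], [110, 0.11],
    [900, 0.40], [950, 0.45], [1000, 0.42],
]
past_best_methods = ["method_A"] * 3 + ["method_B"] * 3

recommended = recommend_method(past_meta_features, past_best_methods, [105, 0.11])
```

In this toy setup the query dataset resembles the three small, low-defect-ratio datasets, so the vote falls to their label. A real system would need richer meta-features and a meta-learner validated against the observed best methods.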


DOI: http://doi.org/10.11591/ijeecs.v27.i2.pp990-1006



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
