Enhancing phishing URL detection through comprehensive feature selection: a comparative analysis across diverse datasets
Abstract
Malicious attacks have become a prominent risk to the safety of online users, with attackers employing increasingly sophisticated techniques to deceive unsuspecting victims. This research focuses on the critical role of feature selection in optimizing phishing uniform resource locator (URL) detection systems. Feature selection improves machine learning (ML) and deep learning (DL) models by efficiently identifying the most informative attributes. This paper provides a comprehensive examination of feature selection techniques using five diverse datasets. Several methods were employed, including random forest (RF) with SelectFromModel, SelectKBest with the chi-square statistic, principal component analysis (PCA), and recursive feature elimination (RFE). The experiments, with particular emphasis on PCA and the fourth dataset, revealed that all four models (RF, decision tree (DT), XGBoost, and multilayer perceptron (MLP)) achieved 100% accuracy in detecting phishing URL attacks. This underscores the efficacy of feature selection methods and contributes to a deeper understanding of feature selection's role in bolstering the effectiveness of phishing detection systems across diverse datasets, highlighting the importance of leveraging techniques such as PCA for optimal results.
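As a rough illustration of the four feature selection methods named in the abstract, the following minimal sketch applies each of them with scikit-learn and scores a decision tree on the reduced features. It is not the authors' pipeline: the data are synthetic (generated with `make_classification` as a stand-in for a phishing URL feature table), and the feature count `K`, the variable names, and the choice of a single decision tree classifier are assumptions made only for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a phishing URL feature table (hypothetical data).
X, y = make_classification(
    n_samples=500, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# chi2 requires non-negative inputs, so scale using training statistics only.
scaler = MinMaxScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

K = 10  # assumed number of features/components to keep

selectors = {
    # Keep the K features with the highest random forest importances.
    "select_from_model_rf": SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0),
        max_features=K,
        threshold=-np.inf,
    ),
    # Keep the K features with the highest chi-square scores.
    "select_k_best_chi2": SelectKBest(chi2, k=K),
    # Project onto the first K principal components.
    "pca": PCA(n_components=K, random_state=0),
    # Recursively drop the weakest feature until K remain.
    "rfe": RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=K),
}

results = {}
for name, selector in selectors.items():
    X_train_sel = selector.fit_transform(X_train, y_train)
    X_test_sel = selector.transform(X_test)
    clf = DecisionTreeClassifier(random_state=0).fit(X_train_sel, y_train)
    results[name] = (X_train_sel.shape[1], clf.score(X_test_sel, y_test))

for name, (n_features, accuracy) in results.items():
    print(f"{name}: {n_features} features, test accuracy = {accuracy:.3f}")
```

Accuracies obtained on synthetic data say nothing about the results reported in the paper; the sketch only shows how the four techniques can be swapped in behind a common `fit_transform`/`transform` interface. Note that PCA produces new components rather than a subset of the original features, which is why it is sometimes described as feature extraction rather than selection.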
Keywords
Decision tree; Feature selection technique; Phishing attack; Random forest; URL classification; XGBoost and MLP
DOI: http://doi.org/10.11591/ijeecs.v36.i2.pp1182-1188
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).