Phishing website detection using novel integration of BERT and XLNet with deep learning sequential models
Abstract
Phishing websites pose a significant threat to online security, necessitating robust detection mechanisms to safeguard users' sensitive information. This study explores the efficacy of various deep learning architectures for phishing website detection. First, three traditional sequential models are evaluated on a curated dataset: a recurrent neural network (RNN), long short-term memory (LSTM), and a gated recurrent unit (GRU), which achieve accuracies of 95%, 96%, and 96.5%, respectively. Building on these results, hybrid architectures are investigated that combine the sequential models with two state-of-the-art language representation models, bidirectional encoder representations from transformers (BERT) and XLNet. Six combinations are evaluated: RNN with BERT, BERT with LSTM, BERT with GRU, RNN with XLNet, XLNet with LSTM, and XLNet with GRU. These achieve accuracies of 94.5%, 96.5%, 96.1%, 95.7%, 97.4%, and 97%, respectively, with XLNet with LSTM performing best, demonstrating the effectiveness of hybrid deep learning architectures in enhancing phishing detection performance. These findings contribute to advancing the state of the art in cybersecurity practice and underscore the importance of leveraging diverse model types to combat online threats effectively.
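To make the comparison between standalone and hybrid models easier to read, the short Python sketch below tabulates the change in accuracy for each hybrid relative to its standalone sequential model. It uses only the accuracy figures quoted in the abstract; the names `baseline`, `hybrid`, and `accuracy_deltas` are illustrative assumptions and do not come from the paper.

```python
# Accuracies (%) exactly as quoted in the abstract.
baseline = {"RNN": 95.0, "LSTM": 96.0, "GRU": 96.5}
hybrid = {
    ("BERT", "RNN"): 94.5,
    ("BERT", "LSTM"): 96.5,
    ("BERT", "GRU"): 96.1,
    ("XLNet", "RNN"): 95.7,
    ("XLNet", "LSTM"): 97.4,
    ("XLNet", "GRU"): 97.0,
}


def accuracy_deltas(baseline, hybrid):
    """Map (encoder, sequential model) to hybrid accuracy minus standalone accuracy.

    Illustrative helper (not from the paper); results are rounded to one
    decimal place to hide floating-point noise.
    """
    return {
        pair: round(acc - baseline[pair[1]], 1)
        for pair, acc in hybrid.items()
    }


deltas = accuracy_deltas(baseline, hybrid)
best_pair = max(hybrid, key=hybrid.get)  # hybrid with the highest reported accuracy

for (encoder, seq), delta in deltas.items():
    acc = hybrid[(encoder, seq)]
    print(f"{encoder} + {seq}: {acc:.1f}% ({delta:+.1f} points vs. {seq} alone)")
print("Highest reported accuracy:", " + ".join(best_pair))
```

On the reported figures, four of the six hybrids improve on their standalone counterpart, while the two BERT hybrids with RNN and GRU fall slightly below it. The largest gain is for XLNet with LSTM. The abstract does not report variance or significance tests, so these differences are best read as indicative only.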
Keywords
BERT; GRU; LSTM; Phishing website; RNN; XLNet
DOI: http://doi.org/10.11591/ijeecs.v36.i2.pp1273-1283
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).