Online Imbalanced Support Vector Machine for Phishing Emails Filtering

Xiaoqing Gu, Tongguang Ni, Wei Wang

Abstract


Phishing emails are a real threat to internet communication and the web economy. Real-world email datasets consist predominantly of ham samples, with only a small percentage of phishing ones. On such imbalanced data, a standard Support Vector Machine (SVM) can produce suboptimal results when filtering phishing emails, and it often requires considerable time to classify large data sets. In this paper, an online version of imbalanced SVM (OISVM) is proposed. First, an email is converted into 20 features selected on the basis of its content and link characteristics. Second, OISVM is developed to improve classification accuracy and reduce computation time: it uses a novel method to adjust the separating hyperplane for imbalanced data sets and an online algorithm to make the retraining process much faster. Experimental results, assessed with a proposed expressive evaluation method, show that OISVM achieves significantly better results than existing methods.
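The abstract does not state the OISVM update rule, so the following is only a minimal sketch of the two general ideas it names: penalizing errors on the rare phishing class more heavily, and updating the model one email at a time instead of retraining from scratch. It uses plain stochastic gradient descent on a class-weighted hinge loss, which is a common stand-in and not the authors' algorithm. The class name, the parameters `lr`, `lam`, and `pos_weight`, and the toy feature vectors are all hypothetical.

```python
class OnlineWeightedLinearSVM:
    """Illustrative linear SVM trained one sample at a time.

    Assumption: labels are +1 (phishing, minority class) and -1 (ham).
    Margin violations on +1 samples cost `pos_weight` times more, which
    shifts the separating hyperplane toward the majority class.
    """

    def __init__(self, n_features, lr=0.1, lam=0.01, pos_weight=5.0):
        self.w = [0.0] * n_features   # weight vector
        self.b = 0.0                  # bias term (not regularized)
        self.lr = lr                  # fixed learning rate
        self.lam = lam                # L2 regularization strength
        self.pos_weight = pos_weight  # extra cost for the minority class

    def decision(self, x):
        """Signed score w.x + b for one feature vector."""
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def predict(self, x):
        """Return +1 (phishing) or -1 (ham)."""
        return 1 if self.decision(x) >= 0 else -1

    def partial_fit(self, x, y):
        """Single SGD step on the class-weighted hinge loss."""
        cost = self.pos_weight if y == 1 else 1.0
        margin = y * self.decision(x)  # computed before any update
        # Weight decay from the L2 regularizer.
        shrink = 1.0 - self.lr * self.lam
        self.w = [shrink * wi for wi in self.w]
        # Hinge-loss subgradient step, taken only on a margin violation.
        if margin < 1.0:
            step = self.lr * cost * y
            self.w = [wi + step * xi for wi, xi in zip(self.w, x)]
            self.b += step


# Toy stream with made-up 3-feature vectors (not the paper's 20 features).
stream = [
    ([0.0, 0.1, 0.0], -1),
    ([0.1, 0.0, 0.2], -1),
    ([0.9, 0.8, 1.0], 1),
    ([0.0, 0.2, 0.1], -1),
]
model = OnlineWeightedLinearSVM(n_features=3)
for features, label in stream:
    model.partial_fit(features, label)  # incremental update per email
```

Because each call to `partial_fit` touches only the current sample, the cost of absorbing a new email is proportional to the number of features rather than to the size of the stored dataset. That is the general motivation for online training, independent of the specific method proposed in the paper.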

 

DOI: http://dx.doi.org/10.11591/telkomnika.v12i6.4562


Keywords


phishing emails; filtering; support vector machine; imbalanced data; online


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
