Machine learning to improve the performance of anomaly-based network intrusion detection in big data

Siriporn Chimphlee, Witcha Chimphlee


With the rapid growth of digital technology communications are overwhelmed by network data traffic. The demand for the internet is growing every day in today's cyber world, raising concerns about network security. Big Data are a term that describes a vast volume of complicated data that is critical for evaluating network patterns and determining what has occurred in the network. Therefore, detecting attacks in a large network is challenging. Intrusion detection system (IDS) is a promising cybersecurity research field. In this paper, we proposed an efficient classification scheme for IDS, which is divided into two procedures, on the CSE-CIC-IDS-2018 dataset, data pre-processing techniques including under-sampling, feature selection, and classifier algorithms were used to assess and decide the best performing model to classify invaders. We have implemented and compared seven classifier machine learning algorithms with various criteria. This work explored the application of the random forest (RF) for feature selection in conjunction with machine learning (ML) techniques including linear regression (LR), k-Nearest Neighbor (k-NN), classification and regression trees (CART), Bayes, RF, multi layer perceptron (MLP), and XGBoost in order to implement IDSS. The experimental results show that the MLP algorithm in the most successful with best performance with evaluation matrix.


Class imbalance; CSE-CIC-IDS-2018; Feature selection; Machine learning; Network intrusion detection

Full Text:




  • There are currently no refbacks.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

shopify stats IJEECS visitor statistics