Double layer machine learning for network intrusion detection system on web server

Muhammad Hafiz Amrullah, Favian Dewanta, Muhamad Erza Aminanto

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Web application networks are growing rapidly, and cybercriminals are consequently launching more aggressive attacks on them. Intrusion detection systems (IDS) make extensive use of signature-based pattern matching, which enables them to identify a wide variety of network-based attacks. For accurate detection of anomalies or attacks, machine learning classifiers significantly improve the robustness of IDS compared to pattern-matching approaches based solely on packet features, such as packet lengths, flow duration, flags, and other characteristics. However, a single machine learning classifier achieves relatively low accuracy on any particular kind of attack, because different attacks follow different patterns. This research proposes a double layer machine learning approach based on the Random Forest and KNN algorithms. The aim is to identify the two most common types of attacks on web servers, namely DoS/DDoS and brute force attacks. Two distinct ML models were developed in parallel for IDS on web servers. The first layer is a 3-class random forest classifier that assigns network records from the dataset to the DoS, DDoS, or normal class. The second layer is a KNN classifier that assigns network records from the dataset to one of four classes: FTP-Patator, SSH-Patator, Web Brute Force, or normal. The selected features can significantly reduce both training time and prediction time. In the simulation results, the first layer (random forest) achieved its best metrics, with an accuracy of 0.9994, when using 40 features. The second layer (KNN) achieved its best metrics, with an accuracy of 0.9945, when using 64 features, but also performed well with 40 features.
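The two-layer arrangement described in the abstract can be illustrated with a minimal sketch in scikit-learn. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic (generated with `make_classification`, 40 features to echo the 40-feature setting), the hyperparameters are library defaults or arbitrary, and the rule for merging the two layers' outputs (`predict_double_layer`, a hypothetical helper) is one plausible choice, since the abstract does not state how the layers' predictions are combined.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Class names follow the abstract; the records themselves are synthetic.
CLASSES = np.array(
    ["normal", "DoS", "DDoS", "FTP-Patator", "SSH-Patator", "Web Brute Force"]
)
LAYER1_LABELS = ["normal", "DoS", "DDoS"]
LAYER2_LABELS = ["normal", "FTP-Patator", "SSH-Patator", "Web Brute Force"]

# Assumption: a synthetic stand-in for labelled network records.
X, y_idx = make_classification(
    n_samples=1200,
    n_features=40,
    n_informative=12,
    n_classes=len(CLASSES),
    n_clusters_per_class=1,
    random_state=0,
)
y = CLASSES[y_idx]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def fit_layer(model, labels):
    """Train one layer only on the records whose label belongs to that layer."""
    mask = np.isin(y_train, labels)
    return model.fit(X_train[mask], y_train[mask])


# Layer 1: 3-class random forest (DoS, DDoS, normal).
layer1 = fit_layer(
    RandomForestClassifier(n_estimators=100, random_state=0), LAYER1_LABELS
)
# Layer 2: 4-class KNN (FTP-Patator, SSH-Patator, Web Brute Force, normal).
layer2 = fit_layer(KNeighborsClassifier(n_neighbors=5), LAYER2_LABELS)


def predict_double_layer(records):
    """Hypothetical merge rule (an assumption, not the paper's):
    keep layer 1's verdict when it flags DoS/DDoS, otherwise use layer 2's."""
    p1 = layer1.predict(records)
    p2 = layer2.predict(records)
    return np.where(p1 != "normal", p1, p2)


combined = predict_double_layer(X_test)
print("combined accuracy on synthetic data:", np.mean(combined == y_test))
```

Because the data are synthetic, the printed accuracy says nothing about the figures reported in the abstract; the sketch only shows how two independently trained classifiers with different label sets can be run side by side on the same records.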

Original language: English
Title of host publication: 2023 10th International Conference on Information Technology, Computer, and Electrical Engineering, ICITACEE 2023
Editors: Yosua Alvin Adi Soetrisno, Patricia Evericho Mountaines
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 281-286
Number of pages: 6
ISBN (Electronic): 9798350322729
Publication status: Published - 2023
Event: International Conference on Information Technology, Computer, and Electrical Engineering 2023 - Virtual, Online, Indonesia
Duration: 31 Aug 2023 to 1 Sept 2023
Conference number: 10th
https://ieeexplore.ieee.org/xpl/conhome/10275798/proceeding (Proceedings)
https://icitacee.undip.ac.id/2023/ (Website)

Conference

Conference: International Conference on Information Technology, Computer, and Electrical Engineering 2023
Abbreviated title: ICITACEE 2023
Country/Territory: Indonesia
City: Virtual, Online
Period: 31/08/23 to 1/09/23

Keywords

  • double layer
  • KNN
  • machine learning
  • random forest
