An empirical assessment of ML models for 5G network intrusion detection: a data leakage-free approach

This paper thoroughly compares thirteen unique Machine Learning (ML) models utilized for Intrusion detection systems (IDS) in a meticulously controlled environment. Unlike previous studies, we introduce a novel approach that meticulously avoids data leakage, enhancing the reliability of our findings...

Bibliographic Details
Main Authors: Bouke, Mohamed Aly, Abdullah, Azizol
Format: Article
Language: English
Published: Elsevier 2024
Online Access: http://psasir.upm.edu.my/id/eprint/113366/1/113366.pdf
http://psasir.upm.edu.my/id/eprint/113366/
https://linkinghub.elsevier.com/retrieve/pii/S2772671124001700
id my.upm.eprints.113366
record_format eprints
spelling my.upm.eprints.113366 2024-11-22T02:59:25Z http://psasir.upm.edu.my/id/eprint/113366/
Elsevier; 2024; Article; PeerReviewed; text; en; cc_by_4
http://psasir.upm.edu.my/id/eprint/113366/1/113366.pdf
Bouke, Mohamed Aly and Abdullah, Azizol (2024) An empirical assessment of ML models for 5G network intrusion detection: a data leakage-free approach. e-Prime - Advances in Electrical Engineering, Electronics and Energy, 8, art. no. 100590, pp. 1-12. ISSN 2772-6711; eISSN 2772-6711
https://linkinghub.elsevier.com/retrieve/pii/S2772671124001700
DOI: 10.1016/j.prime.2024.100590
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description This paper thoroughly compares thirteen unique Machine Learning (ML) models utilized for Intrusion detection systems (IDS) in a meticulously controlled environment. Unlike previous studies, we introduce a novel approach that meticulously avoids data leakage, enhancing the reliability of our findings. The study draws upon a comprehensively labeled 5G-NIDD dataset covering a broad spectrum of network behaviors, from benign real-user traffic to various attack scenarios. Our data preprocessing and experimental design have been carefully structured to eradicate any data leakage, a standout feature of our methodology that significantly improves the robustness and dependability of our results compared to prior studies. The ML models are evaluated using various performance metrics, including accuracy, precision, recall, F1-score, ROC AUC, and execution time. Our results reveal that the K-Nearest Neighbors model is superior in accuracy and ROC AUC, while the Voting Classifier stands out in precision and F1-score. Decision Tree, Bagging, and Extra Trees models exhibit strong recall scores. In contrast, the AdaBoost model falls short across all assessed metrics. Despite displaying only modest performance on other metrics, the Naive Bayes model excels in computational efficiency, offering the quickest execution time. This paper emphasizes the importance of understanding various ML models' distinct strengths, drawbacks, and trade-offs for network intrusion detection. It highlights that no single model is universally superior, and the choice hinges on the nature of the dataset, specific application requirements, and the computational resources available.
format Article
author Bouke, Mohamed Aly
Abdullah, Azizol
title An empirical assessment of ML models for 5G network intrusion detection: a data leakage-free approach