Comparison on machine learning algorithm to fast detection of malicious web pages


Bibliographic Details
Main Authors: Wan Nurul Safawati, Wan Manan; Mohd Nizam, Mohmad Kahar; Noorlin, Mohd Ali
Format: Conference or Workshop Item
Language: English
Published: IEEE 2021
Online Access: https://umpir.ump.edu.my/id/eprint/33329/1/Comparison%20on%20machine%20learning%20algorithm%20to%20fast%20detection_FULL.pdf
https://umpir.ump.edu.my/id/eprint/33329/2/Comparison%20on%20machine%20learning%20algorithm%20to%20fast%20detection.pdf
https://umpir.ump.edu.my/id/eprint/33329/
https://doi.org/10.1109/ICSECS52883.2021.00086
Description
Summary: The rapid growth of technology has increased the use of the Internet and online activities in business, management, education and other areas. However, this has also exposed users to malicious activities and online threats that can disrupt the operation and performance of systems. Accurate and timely detection of such threats is vital. The scale of web pages (i.e. huge volumes of data) and the heterogeneous nature of the web itself complicate the detection process and make accurate detection difficult. Defining the features used to detect malicious web pages is a vital condition for generating accurate detection results; focusing only on certain feature-selection criteria may lead to misdetection of malicious web pages. Previous approaches have used the blacklist technique, a conventional method that has shown promising results in detecting malicious web pages. Therefore, the principle of machine learning, namely training classification algorithms, will be applied to improve detection accuracy. Output will be evaluated using correctly classified instances and incorrectly classified instances. WEKA (Waikato Environment for Knowledge Analysis) will be used for testing and generating the comparison output. Selected datasets from well-known resources will be used, based on identified features, to distinguish malicious web pages from legitimate ones. Among the several decision tree methods compared, Random Forest showed promising results, with a higher sensitivity towards malicious data (98.3%) than the other classification algorithms.
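
The evaluation measures named in the summary (correctly classified instances, incorrectly classified instances, and sensitivity towards the malicious class) can be illustrated with a minimal sketch. This is not the authors' WEKA workflow: the function name `evaluate`, the class labels, and the toy label lists below are all hypothetical and chosen only to show how the quantities relate.

```python
from typing import Dict, Sequence


def evaluate(actual: Sequence[str], predicted: Sequence[str],
             positive: str = "malicious") -> Dict[str, float]:
    """Summarize binary predictions (hypothetical helper, not from the paper)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    # Correctly / incorrectly classified instances, over all classes.
    correct = sum(a == p for a, p in pairs)
    incorrect = len(pairs) - correct
    # Sensitivity (recall) for the positive class:
    # true positives divided by all actual positives.
    true_pos = sum(a == positive and p == positive for a, p in pairs)
    actual_pos = sum(a == positive for a in actual)

    return {
        "correct": correct,
        "incorrect": incorrect,
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "sensitivity": true_pos / actual_pos if actual_pos else 0.0,
    }


# Toy labels for illustration only; these are not data from the paper.
actual = ["malicious", "malicious", "malicious", "benign",
          "benign", "benign", "benign", "malicious"]
predicted = ["malicious", "malicious", "benign", "benign",
             "benign", "malicious", "benign", "malicious"]

result = evaluate(actual, predicted)
print(result)
```

On this toy input, three of the four malicious pages are flagged, so sensitivity is 0.75. The 98.3% figure in the summary is the same ratio, as reported by the authors for Random Forest on their dataset.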