Classification of polymorphic virus based on integrated features

Bibliographic Details
Main Authors: A Hamid, Isredza Rahmi, Subramaniam, Sharmila, Sutoyo, Edi, Abdullah, Zubaile
Format: Article
Language: English
Published: Insight - Indonesian Society for Knowledge and Human Development 2018
Online Access: http://eprints.uthm.edu.my/5013/1/AJ%202018%20%28816%29%20Classification%20of%20polymorphic%20virus%20based%20on%20integrated%20features.pdf
http://eprints.uthm.edu.my/5013/
Description
Summary: Standard virus classification relies on the virus function, a small number of bytes written in assembly language. Current malware intrusion detection and prevention systems have difficulty detecting unknown and multipath polymorphic computer viruses when they rely solely on either static or dynamic features. This paper therefore presents a classification of polymorphic viruses based on integrated features. The integrated features are selected according to the Information Gain rank values of the static and dynamic features. All datasets are then tested on Naïve Bayes and Random Forest classifiers. We extracted 49 features from 700 polymorphic computer virus samples from Netherland Net Lab and VXHeaven, which include benign and polymorphic virus functions. We split the dataset into 60% for training and 40% for testing. Accuracy, receiver operating characteristic and mean absolute error are compared between the two algorithms in experiments on static, dynamic and integrated features. The proposed integrated features achieve 98.5% accuracy using the highest-ranked feature selection.
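
The Information Gain ranking mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes discrete feature values and uses the standard definition IG = H(class) - H(class | feature). The feature names and the four-sample toy dataset are hypothetical, invented only for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG = H(labels) - H(labels | feature) for one discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # Weighted average entropy of the labels within each feature value
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, gain) pairs sorted from highest to lowest gain."""
    gains = {name: information_gain(values, labels)
             for name, values in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): two virus samples, two benign samples.
labels = ["virus", "virus", "benign", "benign"]
columns = {
    "static_has_decryptor_loop": [1, 1, 0, 0],  # separates the classes fully
    "dynamic_writes_to_exe": [1, 0, 1, 0],      # independent of the class
    "static_section_count_high": [1, 1, 1, 0],  # partially informative
}

ranking = rank_features(columns, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In a setup like the one the summary describes, the top-ranked static and dynamic features would be merged into one integrated feature set before training the classifiers. How many features to keep is a choice the summary does not specify.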