An integrated anomaly intrusion detection scheme using statistical, hybridized classifiers and signature approach

Bibliographic Details
Main Author: Mohamed Yassin, Warusia
Format: Thesis
Language: English
Published: 2015
Online Access: http://psasir.upm.edu.my/id/eprint/65260/1/FSKTM%202015%2043IR.pdf
http://psasir.upm.edu.my/id/eprint/65260/
Description
Summary: Intrusion detection systems (IDSs) add a layer of security to a computer system by identifying intrusive activities on it, and they are being enhanced at a rapid pace. Detection methods based on statistical and data mining techniques are widely deployed as anomaly-based detection systems (ADS). Although the statistical-based anomaly detection (SAD) method attracts considerable research interest, it generally continues to suffer from low attack detection rates (also known as true positive rates), which reflect the effectiveness of the detection system. Specifically, this is due to packets affected by outlier data points (i.e., data points that differ greatly from the common data points) and to a threshold size that is usually defined without any further analysis of the observed packets. Both factors significantly affect the process of determining which packets are more likely to be attributable to anomalous behaviour. In recent years, data mining based anomaly detection (DMAD), particularly classification methods, has been continually enhanced to differentiate normal from attack behaviour. Unfortunately, in such methods the outcomes, i.e., the true positive, true negative, false positive and false negative detections that directly influence the rates of accuracy, detection, and false alarms, have not improved much, which remains a persistent problem in the deployment of such systems. The specific drawback behind this is the failure to differentiate precisely between packets whose behaviours resemble one another, such as normal behaviour whose content resembles anomalous behaviour and vice versa. These inaccurate outcomes can compromise the reliability of IDSs and cause them to overlook attacks.
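As an illustration of how the four outcome counts drive the reported rates, the sketch below uses the conventional textbook definitions of accuracy, detection rate and false alarm rate. The exact formulas used in the thesis are not given in this record, so these definitions, and the names `ConfusionCounts` and `detection_metrics`, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # attack packets flagged as attacks
    tn: int  # normal packets passed as normal
    fp: int  # normal packets flagged as attacks
    fn: int  # attack packets passed as normal


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 instead of raising when a class is empty.
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: ConfusionCounts) -> dict:
    """Conventional rates (assumed, not taken from the thesis)."""
    total = c.tp + c.tn + c.fp + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "detection_rate": _ratio(c.tp, c.tp + c.fn),
        "false_alarm_rate": _ratio(c.fp, c.fp + c.tn),
    }
```

Under these definitions, a detector that misses attacks lowers the detection rate, while one that flags normal traffic raises the false alarm rate; both errors reduce accuracy.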
As an ADS can process massive volumes of packets, the processing time needed to discover the patterns in the packets increases accordingly, resulting in late detection of attack packets. The main contributor to this shortcoming is the need to recompute every process for each packet even when the attack behaviour has already been examined. This study aims to improve the detection of anomalous behaviour by identifying the outlier data points in the packets more precisely and by detecting packets with similar behaviours more accurately, while reducing the detection time. An Integrated Anomaly Detection Scheme (IADS) is proposed to overcome the aforesaid drawbacks. The proposed scheme integrates an ADS and a signature-based detection system (SDS) approach for better and faster intrusion detection. Accordingly, Statistical-based Packet Header Anomaly Detection (SPHAD) and a hybridized Naive Bayes and Random Forest classifier (NB+RF) are considered for the ADS, and Signature-based Packet Header Intrusion Detection (SPHID) is proposed as the SDS. In SPHAD, statistical analysis is used to construct a normal profile using statistical formulas, to score the incoming packets, and to compute, through linear regression, the relationship between historic normal behaviour as the dependent variable and observable packet behaviours as the independent variable. The threshold measurement (size) is then defined based on R² and Cohen's d values in order to improve the attack detection rate by identifying more precisely the set of outlier data points present inside the packets. Subsequently, NB+RF, a hybrid classification algorithm, is used to distinguish similar and dissimilar content behaviours of a packet.
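The two statistics named for the SPHAD threshold can be sketched as follows. This is a minimal illustration, not the thesis's procedure: the record does not say how R² and Cohen's d are combined into a threshold, so the decision rule `is_anomalous` and its cut-off values `r2_min` and `d_max` are hypothetical, as are all function names.

```python
from math import sqrt
from statistics import mean, stdev


def r_squared(observed, historic):
    """R^2 of the least-squares line historic = a + b * observed."""
    mx, my = mean(observed), mean(historic)
    sxx = sum((x - mx) ** 2 for x in observed)
    syy = sum((y - my) ** 2 for y in historic)
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, historic))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no linear relationship to measure
    return (sxy * sxy) / (sxx * syy)


def cohens_d(baseline, sample):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(baseline), len(sample)
    pooled = sqrt(((na - 1) * stdev(baseline) ** 2
                   + (nb - 1) * stdev(sample) ** 2) / (na + nb - 2))
    return (mean(sample) - mean(baseline)) / pooled if pooled else 0.0


def is_anomalous(r2, d, r2_min=0.5, d_max=0.8):
    # Hypothetical rule: flag traffic that fits the normal profile poorly
    # (low R^2) or whose scores are shifted far from it (large |d|).
    return r2 < r2_min or abs(d) > d_max
```

A low R² indicates that observed packet scores track the historic normal profile poorly, and a large Cohen's d indicates a sizeable shift in mean score; either could plausibly mark outliers, which is the only reason they are combined with `or` here.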
The Naive Bayes (NB) classifier is employed to compute the prior and posterior probabilities of a packet; this information, together with the header values and the statistical analysis information, is then fed to the Random Forest (RF) classifier to improve the detection of actual attacks and normal packets. SPHID then extracts the distinct behaviour of the packets that NB+RF has verified as attacks and computes attack signatures from it for faster future detection, since the detection time is reduced for any attack whose signature is already in the signature database. The effectiveness of the IADS has been evaluated in terms of different detection capabilities (i.e., false positive, false negative, true positive, true negative, false alarm, accuracy, detection rate, attack data detection rate, normal data detection rate) and detection times, using the DARPA 1999 and ISCX 2012 intrusion detection benchmark datasets as well as Live-data. Results from the experiments demonstrate that IADS detects attacks and normal packets more precisely than previous work and than the ADS, which performs intrusion detection without employing the SPHID method. In addition, the detection time of IADS is much improved compared to that of the ADS. Thus, IADS is a better solution for anomaly detection, both in detecting untrustworthy behaviour and in defining attack and normal behaviours more accurately.
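The idea of checking known signatures before running the slower anomaly pipeline can be sketched as below. This is a rough illustration under stated assumptions: the header fields in `SIGNATURE_FIELDS`, the class `SignatureFirstDetector`, and the `slow_classifier` callable (a stand-in for the NB+RF stage) are all hypothetical, and the record does not describe how SPHID actually derives or stores its signatures.

```python
from typing import Callable, Dict, Set, Tuple

Packet = Dict[str, object]
Signature = Tuple[object, ...]

# Hypothetical choice of header fields that make up a signature.
SIGNATURE_FIELDS = ("protocol", "dst_port", "flags")


def make_signature(packet: Packet) -> Signature:
    return tuple(packet.get(field) for field in SIGNATURE_FIELDS)


class SignatureFirstDetector:
    """Consult stored signatures first; fall back to the slow classifier."""

    def __init__(self, slow_classifier: Callable[[Packet], bool]):
        self._slow = slow_classifier  # stand-in for the anomaly stage
        self._signatures: Set[Signature] = set()
        self.slow_calls = 0  # counts how often the slow path runs

    def is_attack(self, packet: Packet) -> bool:
        sig = make_signature(packet)
        if sig in self._signatures:
            return True  # fast path: behaviour already verified as attack
        self.slow_calls += 1
        verdict = self._slow(packet)
        if verdict:
            self._signatures.add(sig)  # remember for future packets
        return verdict
```

With this arrangement, a repeated attack pattern costs one set lookup instead of a full classification, which is the source of the detection-time saving the summary attributes to the signature database. Only verified attacks are stored, so normal packets always take the slow path in this sketch.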