Malware behavior profiling from unstructured data
Recently, the emergence of new malware has posed a major threat, especially in the finance sector, where much online banking data has been stolen by adversaries. Information about a malware threat needs to be collected immediately after its outbreak. Early detection can save others from being t...
Main Authors: | Yoong, Jien Chiam; Maarof, Mohd. Aizaini; Kassim, Mohamad Nizam; Zainal, Anazida |
---|---|
Format: | Conference or Workshop Item |
Published: | 2020 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.utm.my/id/eprint/92351/ http://dx.doi.org/10.1007/978-3-030-49345-5_14 |
id |
my.utm.92351 |
---|---|
record_format |
eprints |
spelling |
my.utm.92351 2021-09-28T07:38:56Z http://eprints.utm.my/id/eprint/92351/ Malware behavior profiling from unstructured data. Yoong, Jien Chiam; Maarof, Mohd. Aizaini; Kassim, Mohamad Nizam; Zainal, Anazida. QA75 Electronic computers. Computer science. 2020. Conference or Workshop Item. PeerReviewed. Yoong, Jien Chiam and Maarof, Mohd. Aizaini and Kassim, Mohamad Nizam and Zainal, Anazida (2020) Malware behavior profiling from unstructured data.
In: 11th International Conference on Soft Computing & Pattern Recognition (SOCPAR 2019), 13 – 15 December 2019, Hyderabad, India. http://dx.doi.org/10.1007/978-3-030-49345-5_14 |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
topic |
QA75 Electronic computers. Computer science |
spellingShingle |
QA75 Electronic computers. Computer science Yoong, Jien Chiam Maarof, Mohd. Aizaini Kassim, Mohamad Nizam Zainal, Anazida Malware behavior profiling from unstructured data |
description |
Recently, the emergence of new malware has posed a major threat, especially in the finance sector, where much online banking data has been stolen by adversaries. Information about a malware threat needs to be collected immediately after its outbreak, since early detection can save others from becoming victims. Unfortunately, there is a time delay before information about new malware reaches a malware database such as ExploitDB. A pre-emptive approach is needed to gather first-hand information about new malware as a preventive measure. One such method is to extract information from open source data such as online news using Named Entity Recognition (NER). However, existing NER systems are unable to extract domain-specific entities from online news accurately. The aim of this paper is to extract malware entities and their behaviour attributes using an extended version of NER with HMM and CRF. A malware-annotated corpus is produced to support supervised learning for the machine learning approach to the named entity tagger. The results show that CRF performs slightly better than HMM. A few experiments are performed to optimize the performance of CRF in terms of feature extraction. Finally, the malware behaviour information is visualized on a dashboard that combines a few statistical graphs built with matplotlib. The purpose of visualizing the malware behaviour profile extracted from online news is to help cyber security experts better understand malware behaviour. |
format |
Conference or Workshop Item |
author |
Yoong, Jien Chiam; Maarof, Mohd. Aizaini; Kassim, Mohamad Nizam; Zainal, Anazida |
author_facet |
Yoong, Jien Chiam; Maarof, Mohd. Aizaini; Kassim, Mohamad Nizam; Zainal, Anazida |
author_sort |
Yoong, Jien Chiam |
title |
Malware behavior profiling from unstructured data |
title_sort |
malware behavior profiling from unstructured data |
publishDate |
2020 |
url |
http://eprints.utm.my/id/eprint/92351/ http://dx.doi.org/10.1007/978-3-030-49345-5_14 |