Staff View: Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification

Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification

Recent research in digital forensics attempts to classify image clusters as JPEG or non-JPEG before recovering JPEG image files. This classification may improve JPEG recovery accuracy and reduce processing time. In this work, three content-based feature extraction methods are used...

Bibliographic Details
Main Authors:	Ali R.R., Al-Dayyeni W.S., Gunasekaran S.S., Mostafa S.A., Abdulkader A.H., Rachmawanto E.H.
Other Authors:	57200536163
Format:	Conference Paper
Published:	Springer Science and Business Media Deutschland GmbH 2023

id	my.uniten.dspace-27232
record_format	dspace
spelling	my.uniten.dspace-272322023-05-29T17:41:19Z Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification Ali R.R. Al-Dayyeni W.S. Gunasekaran S.S. Mostafa S.A. Abdulkader A.H. Rachmawanto E.H. 57200536163 57225961808 55652730500 37036085800 57545111700 57193850466 Recent research in digital forensic attempts to classify image clusters into JPEG or non-JPEG clusters before recovering JPEG image files. This issue might improve the recovering JPEG image accuracy and reduce the processing time. In this work, three content-based feature extraction methods are used. The Rate of Change (RoC) is used for tracking relevant bytes in the appropriate groups of their orders. Entropy and Byte Frequency Distribution (BFD) are used to produce an image cluster histogram based on the size of the byte value. Subsequently, we deploy the Extreme Learning Machine (ELM) classifier to evaluate these three features. The ELM identifies the type based on the generated feature vector, whether a JPEG file or a non-JPEG file type. The proposed method is implemented in MATLAB 2017a software and tested and evaluated by using the DFRWS dataset. The test results show that the ELM produces high classification accuracy in identifying the file type. The difference in accuracy between the combinations of the tested features is relatively small. The worst accuracy is generated when the entropy method is used, which is 72.62%, and the best accuracy of 93.46% is generated when using a combination of the three features. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Final 2023-05-29T09:41:19Z 2023-05-29T09:41:19Z 2022 Conference Paper 10.1007/978-3-030-98015-3_21 2-s2.0-85126979417 https://www.scopus.com/inward/record.uri?eid=2-s2.0-85126979417&doi=10.1007%2f978-3-030-98015-3_21&partnerID=40&md5=c1f85c0a4dcac107ced9c9d91984ebb4 https://irepository.uniten.edu.my/handle/123456789/27232 439 LNNS 314 325 Springer Science and Business Media Deutschland GmbH Scopus
institution	Universiti Tenaga Nasional
building	UNITEN Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Tenaga Nasional
content_source	UNITEN Institutional Repository
url_provider	http://dspace.uniten.edu.my/
description	Recent research in digital forensics attempts to classify image clusters as JPEG or non-JPEG before recovering JPEG image files. This classification may improve JPEG recovery accuracy and reduce processing time. In this work, three content-based feature extraction methods are used. The Rate of Change (RoC) is used for tracking relevant bytes in the appropriate groups of their orders. Entropy and Byte Frequency Distribution (BFD) are used to produce an image cluster histogram based on the size of the byte value. Subsequently, we deploy the Extreme Learning Machine (ELM) classifier to evaluate these three features. The ELM identifies the type from the generated feature vector as either a JPEG or a non-JPEG file type. The proposed method is implemented in MATLAB 2017a and tested and evaluated using the DFRWS dataset. The test results show that the ELM produces high classification accuracy in identifying the file type. The difference in accuracy between the combinations of the tested features is relatively small. The worst accuracy, 72.62%, is obtained when the entropy method is used, and the best accuracy, 93.46%, is obtained with a combination of the three features. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
author2	57200536163
author_facet	57200536163 Ali R.R. Al-Dayyeni W.S. Gunasekaran S.S. Mostafa S.A. Abdulkader A.H. Rachmawanto E.H.
format	Conference Paper
author	Ali R.R. Al-Dayyeni W.S. Gunasekaran S.S. Mostafa S.A. Abdulkader A.H. Rachmawanto E.H.
spellingShingle	Ali R.R. Al-Dayyeni W.S. Gunasekaran S.S. Mostafa S.A. Abdulkader A.H. Rachmawanto E.H. Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification
author_sort	Ali R.R.
title	Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification
title_short	Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification
title_full	Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification
title_fullStr	Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification
title_full_unstemmed	Content-Based Feature Extraction and Extreme Learning Machine for Optimizing File Cluster Types Identification
title_sort	content-based feature extraction and extreme learning machine for optimizing file cluster types identification
publisher	Springer Science and Business Media Deutschland GmbH
publishDate	2023
_version_	1806424276544258048
score	13.23648
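
The description above names three content-based features: Byte Frequency Distribution (BFD), entropy, and Rate of Change (RoC). The record does not give their formulas, so the following is only a minimal Python sketch under stated assumptions, not the authors' MATLAB implementation. BFD is taken as the normalized 256-bin histogram of byte values, entropy as Shannon entropy in bits per byte, and RoC as the mean absolute difference between consecutive bytes (a common definition in file-type identification work, assumed here). All function names are hypothetical.

```python
import math
from collections import Counter


def byte_frequency_distribution(cluster: bytes) -> list:
    """Normalized 256-bin histogram of byte values (assumed BFD definition)."""
    if not cluster:
        return [0.0] * 256
    counts = Counter(cluster)  # iterating bytes yields ints in 0..255
    n = len(cluster)
    return [counts[v] / n for v in range(256)]


def shannon_entropy(cluster: bytes) -> float:
    """Shannon entropy in bits per byte, between 0.0 and 8.0."""
    bfd = byte_frequency_distribution(cluster)
    return 0.0 - sum(p * math.log2(p) for p in bfd if p > 0)


def rate_of_change(cluster: bytes) -> float:
    """Mean absolute difference between consecutive bytes (assumed RoC definition)."""
    if len(cluster) < 2:
        return 0.0
    total = sum(abs(b - a) for a, b in zip(cluster, cluster[1:]))
    return total / (len(cluster) - 1)


def feature_vector(cluster: bytes) -> list:
    """Concatenate the three features: 256 BFD bins, then entropy, then RoC."""
    return byte_frequency_distribution(cluster) + [
        shannon_entropy(cluster),
        rate_of_change(cluster),
    ]
```

Calling `feature_vector(cluster)` on a raw cluster (for example a 512-byte block read from a disk image) yields a 258-element vector. Compressed JPEG payload tends to have entropy near the 8-bit maximum, which is one reason entropy alone may separate it poorly from other compressed formats.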

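The description states that an Extreme Learning Machine (ELM) classifies each feature vector as JPEG or non-JPEG, but gives no architecture details. As an illustration only, the sketch below follows the standard ELM recipe: a single hidden layer whose weights are drawn at random and left untrained, with output weights solved in closed form by a pseudoinverse. The hidden-layer size, sigmoid activation, 0.5 decision threshold, and the class name are assumptions, and NumPy stands in for the MATLAB environment mentioned in the record.

```python
import numpy as np


class ExtremeLearningMachine:
    """Binary ELM sketch: random fixed hidden layer, least-squares output layer."""

    def __init__(self, n_hidden: int = 64, seed: int = 0):
        self.n_hidden = n_hidden  # assumed size; the record does not state one
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X: np.ndarray) -> np.ndarray:
        # Sigmoid activation of the random projection (assumed activation).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y) -> "ExtremeLearningMachine":
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)  # labels: 1 = JPEG, 0 = non-JPEG
        # Input weights and biases are sampled once and never updated.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights: minimum-norm least-squares solution of H @ beta = y.
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X) -> np.ndarray:
        scores = self._hidden(np.asarray(X, dtype=float)) @ self.beta
        return (scores >= 0.5).astype(int)
```

Training reduces to one pseudoinverse computation, which is why ELMs are often chosen when training time matters. In practice the feature columns would usually be scaled to comparable ranges before fitting, since entropy and RoC values are much larger than individual BFD bins.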

Similar Items