Experimental analysis in Hadoop MapReduce: A closer look at fault detection and recovery techniques
Hadoop MapReduce reactively detects and recovers from faults after they occur, relying on static heartbeat detection and re-execution from scratch. However, these techniques lead to excessive response time penalties and inefficient resource consumption during detection and recovery. Existing fault-tolerance solutions aim to mitigate these limitations without considering critical conditions such as fail-slow faults, the impact of faults at different infrastructure levels, and the relationship between the detection and recovery stages. This paper analyses the response time under two main fault conditions, fail-stop and fail-slow, when they manifest at the node, service, and task levels at runtime. In addition, we focus on the relationship between the time taken to detect faults and the time taken to recover from them. The experimental analysis is conducted on a real Hadoop cluster comprising the MapReduce, YARN and HDFS frameworks. Our analysis shows that recovering from a single fault incurs an average response time penalty of 67.6%. Even when the detection and recovery times are well tuned, data locality and resource availability must also be considered to obtain the optimum tolerance time and the lowest penalties.
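The static detection and recovery behaviour the abstract refers to is governed by a handful of fixed Hadoop configuration properties. The sketch below is not taken from the paper; the class name and the use of a bare `Configuration` object are illustrative only, and the values shown are the framework's documented stock defaults rather than the authors' experimental settings.

```java
import org.apache.hadoop.conf.Configuration;

// Minimal sketch of the static heartbeat/timeout settings behind Hadoop's
// reactive fault handling; values are the stock defaults, not the paper's setup.
public class StaticHeartbeatDefaults {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // HDFS: DataNodes heartbeat the NameNode every 3 s; a node is declared dead only after
        // 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval (~10.5 min).
        conf.setLong("dfs.heartbeat.interval", 3);
        conf.setLong("dfs.namenode.heartbeat.recheck-interval", 300_000);

        // YARN: NodeManagers heartbeat the ResourceManager every 1 s, but the RM waits
        // the liveness-monitor expiry interval (10 min) before expiring a node.
        conf.setLong("yarn.resourcemanager.nodemanagers.heartbeat-interval-ms", 1_000);
        conf.setLong("yarn.nm.liveness-monitor.expiry-interval-ms", 600_000);

        // MapReduce: a task that reports no progress for mapreduce.task.timeout ms is killed
        // and re-executed from scratch, up to maxattempts attempts on another container.
        conf.setLong("mapreduce.task.timeout", 600_000);
        conf.setInt("mapreduce.map.maxattempts", 4);
        conf.setInt("mapreduce.reduce.maxattempts", 4);

        System.out.println("Task timeout (ms): " + conf.getLong("mapreduce.task.timeout", -1));
    }
}
```

With these defaults, a failed node can go undetected for on the order of ten minutes and a hung task for ten minutes before re-execution even begins, which illustrates the kind of fixed detection latency and re-execution cost the abstract describes.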
Main Authors: | Saadoon, Muntadher; Hamid, Siti Hafizah Ab; Sofian, Hazrina; Altarturi, Hamza; Nasuha, Nur; Azizul, Zati Hakim; Sani, Asmiza Abdul; Asemi, Adeleh |
---|---|
Format: | Article |
Published: | MDPI, 2021 |
Subjects: | QD Chemistry; TA Engineering (General). Civil engineering (General) |
Online Access: | http://eprints.um.edu.my/33921/ |
id | my.um.eprints.33921 |
---|---|
record_format | eprints |
spelling | my.um.eprints.33921 2022-07-12T04:45:50Z http://eprints.um.edu.my/33921/ Saadoon, Muntadher and Hamid, Siti Hafizah Ab and Sofian, Hazrina and Altarturi, Hamza and Nasuha, Nur and Azizul, Zati Hakim and Sani, Asmiza Abdul and Asemi, Adeleh (2021) Experimental analysis in Hadoop MapReduce: A closer look at fault detection and recovery techniques. Sensors, 21 (11). ISSN 1424-8220, DOI https://doi.org/10.3390/s21113799. MDPI, 2021-06, Article, PeerReviewed. |
institution | Universiti Malaya |
building | UM Library |
collection | Institutional Repository |
continent | Asia |
country | Malaysia |
content_provider | Universiti Malaya |
content_source | UM Research Repository |
url_provider | http://eprints.um.edu.my/ |
topic | QD Chemistry; TA Engineering (General). Civil engineering (General) |