Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System
Hadoop is widely adopted as a big data processing application because it can run on commercial hardware in a reasonable time. Hadoop uses asynchronous blocking concurrency based on the Thread and Future classes. Therefore, in some cases, such as a network link or hardware failure, a running task may block other task...
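The blocking behaviour mentioned in the summary can be illustrated with plain `java.util.concurrent` classes. The following is a minimal sketch, not Hadoop's own code: the class name `BlockingVsCallbackSketch`, its methods, and the latch that simulates a slow task are all assumptions made for this illustration. It contrasts collecting results with `Future.get()`, which makes the caller wait on the slow task first, with registering a callback on a `CompletableFuture`, which lets the fast result be handled as soon as it is available.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; none of these names come from Hadoop or the paper.
final class BlockingVsCallbackSketch {

    // Waits on a latch without leaking a checked exception into lambdas.
    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Asynchronous but blocking: results are collected with Future.get() in submission order. */
    static List<String> blockingOrder() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // The latch only makes the demo deterministic: "slow" cannot finish before "fast" has run.
        CountDownLatch fastRan = new CountDownLatch(1);
        try {
            Future<String> slow = pool.submit(() -> { awaitQuietly(fastRan); return "slow"; });
            Future<String> fast = pool.submit(() -> { fastRan.countDown(); return "fast"; });
            List<String> order = new ArrayList<>();
            order.add(slow.get()); // the caller waits here before it can look at "fast"
            order.add(fast.get());
            return order;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Asynchronous and non-blocking: each result is recorded by a callback when it completes. */
    static List<String> callbackOrder() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch fastRecorded = new CountDownLatch(1);
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        try {
            CompletableFuture<Void> slow = CompletableFuture
                    .supplyAsync(() -> { awaitQuietly(fastRecorded); return "slow"; }, pool)
                    .thenAccept(order::add);
            CompletableFuture<Void> fast = CompletableFuture
                    .supplyAsync(() -> "fast", pool)
                    .thenAccept(order::add)
                    .thenRun(fastRecorded::countDown);
            // Joining here is only so the demo can return the final order.
            CompletableFuture.allOf(slow, fast).join();
            return new ArrayList<>(order);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Future.get():  " + blockingOrder());
        System.out.println("callbacks:     " + callbackOrder());
    }
}
```

In the first method the fast result is only seen after the slow one, because the caller is parked in `get()`. In the second, the fast result is recorded first. This is a general Java idiom and says nothing about how the paper's ANB method is implemented internally.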
Main Authors: | Khoiruddin, A.A., Zakaria, N., Alhussian, H. |
---|---|
Format: | Article |
Published: | Insight Society, 2020 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85097534444&doi=10.18517%2fijaseit.10.5.9073&partnerID=40&md5=06bb7f5641e7773e25a91dc64089e2e0 http://eprints.utp.edu.my/23113/ |
id |
my.utp.eprints.23113 |
---|---|
record_format |
eprints |
spelling |
my.utp.eprints.23113 2021-08-19T05:26:36Z Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System Khoiruddin, A.A. Zakaria, N. Alhussian, H. © Insight Society 2020 Article NonPeerReviewed https://www.scopus.com/inward/record.uri?eid=2-s2.0-85097534444&doi=10.18517%2fijaseit.10.5.9073&partnerID=40&md5=06bb7f5641e7773e25a91dc64089e2e0 Khoiruddin, A.A. and Zakaria, N. and Alhussian, H. (2020) Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System. International Journal on Advanced Science, Engineering and Information Technology, 10 (5). pp. 1913-1919. http://eprints.utp.edu.my/23113/ |
institution |
Universiti Teknologi Petronas |
building |
UTP Resource Centre |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Petronas |
content_source |
UTP Institutional Repository |
url_provider |
http://eprints.utp.edu.my/ |
description |
Hadoop is widely adopted as a big data processing application because it can run on commercial hardware in a reasonable time. Hadoop uses asynchronous blocking concurrency based on the Thread and Future classes. Therefore, in some cases, such as a network link or hardware failure, a running task may block other tasks from running (the task becomes a straggler). Hadoop releases are equipped with algorithms to handle the straggler task problem. However, these algorithms manage Map and Reduce tasks in the same way, although the root cause of straggling may differ between the two. In this paper, the Asynchronous Non-Blocking (ANB) method is proposed to improve performance and avoid the blocking of Reduce tasks in Hadoop. Instead of a single queue, our approach uses two queues: a task queue and a callback queue. When a task is not ready or is detected as a straggler, it is removed from the main task queue and temporarily sent to the callback queue. When the task is ready to run, it is sent back to the main task queue for execution. The performance of the algorithm is compared with rTuner, the most recent work found on handling straggler Reduce tasks. The comparison shows that ANB consistently achieves a shorter completion time because any unready task is put directly into the callback queue without blocking other tasks. Furthermore, the overhead in rTuner is high because it must check the straggler status and determine why a task has become a straggler. |
format |
Article |
author |
Khoiruddin, A.A. Zakaria, N. Alhussian, H. |
title |
Asynchronous Non-Blocking Algorithm to Handle Straggler Reduce Tasks in Hadoop System |
publisher |
Insight Society |
publishDate |
2020 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85097534444&doi=10.18517%2fijaseit.10.5.9073&partnerID=40&md5=06bb7f5641e7773e25a91dc64089e2e0 http://eprints.utp.edu.my/23113/ |
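The two-queue scheme described in the abstract (a main task queue plus a callback queue for tasks that are not ready) can be sketched as follows. This is a single-threaded toy model under stated assumptions, not the authors' implementation: the class `TwoQueueSketch`, the `Task` type, the `ready` predicate, and the round counter in `demo()` are hypothetical names introduced here, and the abstract does not say how readiness or straggler status is actually detected.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical toy model of a task queue plus callback queue; not taken from the paper's code.
final class TwoQueueSketch {

    /** A task is modelled as a name plus a readiness check (an assumption of this sketch). */
    static final class Task {
        final String name;
        final BooleanSupplier ready;

        Task(String name, BooleanSupplier ready) {
            this.name = name;
            this.ready = ready;
        }
    }

    private final Deque<Task> taskQueue = new ArrayDeque<>();     // main queue
    private final Deque<Task> callbackQueue = new ArrayDeque<>(); // parked, unready tasks
    private final List<String> executionOrder = new ArrayList<>();

    void submit(Task task) {
        taskQueue.addLast(task);
    }

    /** One scheduling round: re-admit parked tasks that became ready, then handle one task. */
    void step() {
        Iterator<Task> parked = callbackQueue.iterator();
        while (parked.hasNext()) {
            Task task = parked.next();
            if (task.ready.getAsBoolean()) {
                parked.remove();
                taskQueue.addLast(task); // sent back to the main queue
            }
        }
        Task next = taskQueue.pollFirst();
        if (next == null) {
            return; // nothing runnable in this round
        }
        if (next.ready.getAsBoolean()) {
            executionOrder.add(next.name); // "running" the task is just recording its name
        } else {
            callbackQueue.addLast(next);   // park it instead of blocking the tasks behind it
        }
    }

    boolean isIdle() {
        return taskQueue.isEmpty() && callbackQueue.isEmpty();
    }

    List<String> executionOrder() {
        return executionOrder;
    }

    /** Demo: r2 is unready until round 3, so r3 runs ahead of it. */
    static List<String> demo() {
        TwoQueueSketch scheduler = new TwoQueueSketch();
        int[] round = {0};
        scheduler.submit(new Task("r1", () -> true));
        scheduler.submit(new Task("r2", () -> round[0] >= 3)); // simulated unready task
        scheduler.submit(new Task("r3", () -> true));
        // Terminates only because every demo task eventually becomes ready.
        while (!scheduler.isIdle()) {
            scheduler.step();
            round[0]++;
        }
        return scheduler.executionOrder();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [r1, r3, r2]
    }
}
```

In this model the unready task `r2` is moved aside on its turn, `r3` runs immediately, and `r2` runs once its readiness check passes. A real scheduler would also need concurrency control and a policy for tasks that never become ready, which this sketch leaves out.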