Analyzing the resilience of convolutional neural networks implemented on GPUs: Alexnet as a case study

Convolutional Neural Networks (CNNs) have been used extensively in healthcare applications. At present, GPUs are the most prominent and dominant DNN accelerators, used to speed up the execution of CNN algorithms and thereby improve their performance and latency. However, GPUs are prone to...

Bibliographic Details
Main Authors: Khalid Adam, Ismail Hammad; Izzeldin, I. Mohd; Ibrahim, Younis
Format: Article
Language: English
Published: Faculty of Electrical Engineering, Computer Science and Information Technology, Josip Juraj Strossmayer University of Osijek 2021
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://umpir.ump.edu.my/id/eprint/32465/1/Analyzing%20the%20resilience%20of%20convolutional%20neural%20networks%20implemented%20on%20GPUs.pdf
http://umpir.ump.edu.my/id/eprint/32465/
https://doi.org/10.32985/ijeces.12.2.4
id my.ump.umpir.32465
record_format eprints
spelling my.ump.umpir.324652021-10-28T07:00:14Z http://umpir.ump.edu.my/id/eprint/32465/ Analyzing the resilience of convolutional neural networks implemented on GPUs: Alexnet as a case study Khalid Adam, Ismail Hammad Izzeldin, I. Mohd Ibrahim, Younis TK Electrical engineering. Electronics Nuclear engineering Convolutional Neural Networks (CNNs) have been used extensively in healthcare applications. At present, GPUs are the most prominent and dominant DNN accelerators, used to speed up the execution of CNN algorithms and thereby improve their performance and latency. However, GPUs are prone to soft errors, which can dramatically affect the behavior of the GPU. The resulting faults may corrupt data values or logic operations and cause errors such as Silent Data Corruption. Unfortunately, soft errors propagate from the physical level (microarchitecture) to the application level (CNN model). This paper analyzes the reliability of the AlexNet model based on two metrics: (1) critical kernel vulnerability (CKV), used to identify the malfunction and light-malfunction errors in each kernel, and (2) critical layer vulnerability (CLV), used to track the malfunction and light-malfunction errors through the layers. To achieve this, we injected faults into AlexNet, which has been widely used in healthcare applications, running on an NVIDIA GPU, using the SASSIFI fault injector as the main evaluation tool. The experiments show that the average percentage of errors that cause the model to malfunction is reduced from 3.7% to 0.383% by hardening only the vulnerable parts, with an overhead of only 0.2923%. This is a substantial improvement in model reliability for healthcare applications.
Faculty of Electrical Engineering, Computer Science and Information Technology, Josip Juraj Strossmayer University of Osijek 2021-06-21 Article PeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/32465/1/Analyzing%20the%20resilience%20of%20convolutional%20neural%20networks%20implemented%20on%20GPUs.pdf Khalid Adam, Ismail Hammad and Izzeldin, I. Mohd and Ibrahim, Younis (2021) Analyzing the resilience of convolutional neural networks implemented on GPUs: Alexnet as a case study. International Journal of Electrical and Computer Engineering Systems, 12 (2). pp. 91-103. ISSN 1847-6996 https://doi.org/10.32985/ijeces.12.2.4 https://doi.org/10.32985/ijeces.12.2.4
institution Universiti Malaysia Pahang
building UMP Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang
content_source UMP Institutional Repository
url_provider http://umpir.ump.edu.my/
language English
topic TK Electrical engineering. Electronics Nuclear engineering
spellingShingle TK Electrical engineering. Electronics Nuclear engineering
Khalid Adam, Ismail Hammad
Izzeldin, I. Mohd
Ibrahim, Younis
Analyzing the resilience of convolutional neural networks implemented on GPUs: Alexnet as a case study
description Convolutional Neural Networks (CNNs) have been used extensively in healthcare applications. At present, GPUs are the most prominent and dominant DNN accelerators, used to speed up the execution of CNN algorithms and thereby improve their performance and latency. However, GPUs are prone to soft errors, which can dramatically affect the behavior of the GPU. The resulting faults may corrupt data values or logic operations and cause errors such as Silent Data Corruption. Unfortunately, soft errors propagate from the physical level (microarchitecture) to the application level (CNN model). This paper analyzes the reliability of the AlexNet model based on two metrics: (1) critical kernel vulnerability (CKV), used to identify the malfunction and light-malfunction errors in each kernel, and (2) critical layer vulnerability (CLV), used to track the malfunction and light-malfunction errors through the layers. To achieve this, we injected faults into AlexNet, which has been widely used in healthcare applications, running on an NVIDIA GPU, using the SASSIFI fault injector as the main evaluation tool. The experiments show that the average percentage of errors that cause the model to malfunction is reduced from 3.7% to 0.383% by hardening only the vulnerable parts, with an overhead of only 0.2923%. This is a substantial improvement in model reliability for healthcare applications.
format Article
author Khalid Adam, Ismail Hammad
Izzeldin, I. Mohd
Ibrahim, Younis
author_facet Khalid Adam, Ismail Hammad
Izzeldin, I. Mohd
Ibrahim, Younis
author_sort Khalid Adam, Ismail Hammad
title Analyzing the resilience of convolutional neural networks implemented on GPUs: Alexnet as a case study
publisher Faculty of Electrical Engineering, Computer Science and Information Technology, Josip Juraj Strossmayer University of Osijek
publishDate 2021
url http://umpir.ump.edu.my/id/eprint/32465/1/Analyzing%20the%20resilience%20of%20convolutional%20neural%20networks%20implemented%20on%20GPUs.pdf
http://umpir.ump.edu.my/id/eprint/32465/
https://doi.org/10.32985/ijeces.12.2.4