Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning

The activation function is a key component in deep learning that performs non-linear mappings between inputs and outputs. The Rectified Linear Unit (ReLU) has been the most popular activation function across the deep learning community. However, ReLU has several shortcomings that can result in ine...

Full description

Saved in:
Bibliographic Details
Main Authors: Hock, Hung Chieng, Wahid, Noorhaniza, Ong, Pauline
Format: Article
Language:English
Published: Universiti Utara Malaysia 2021
Subjects:
Online Access:https://repo.uum.edu.my/id/eprint/28125/1/document%20%284%29.pdf
https://doi.org/10.32890/jict.20.1.2021.9267
https://repo.uum.edu.my/id/eprint/28125/
https://www.e-journal.uum.edu.my/index.php/jict/article/view/12398
id my.uum.repo.28125
record_format eprints
spelling my.uum.repo.28125 2023-05-21T15:21:24Z https://repo.uum.edu.my/id/eprint/28125/ Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning Hock, Hung Chieng Wahid, Noorhaniza Ong, Pauline QA75 Electronic computers. Computer science The activation function is a key component in deep learning that performs non-linear mappings between inputs and outputs. The Rectified Linear Unit (ReLU) has been the most popular activation function across the deep learning community. However, ReLU has several shortcomings that can result in inefficient training of deep neural networks: 1) the negative cancellation property of ReLU treats negative inputs as unimportant information for learning, resulting in performance degradation; 2) the inherent predefined nature of ReLU is unlikely to promote additional flexibility, expressivity, and robustness in the networks; 3) the mean activation of ReLU is highly positive and leads to a bias-shift effect in the network layers; and 4) the multi-linear structure of ReLU restricts the non-linear approximation power of the networks. To tackle these shortcomings, this paper introduces Parametric Flatten-T Swish (PFTS) as an alternative to ReLU. With ReLU as the baseline, experiments showed that PFTS improved classification accuracy on the SVHN dataset by 0.31%, 0.98%, 2.16%, 17.72%, 1.35%, 0.97%, 39.99%, and 71.83% on DNN-3A, DNN-3B, DNN-4, DNN-5A, DNN-5B, DNN-5C, DNN-6, and DNN-7, respectively. In addition, PFTS achieved the highest mean rank among the compared methods. The proposed PFTS manifested higher non-linear approximation power during training and thereby improved the predictive performance of the networks. Universiti Utara Malaysia 2021 Article PeerReviewed application/pdf en cc4_by https://repo.uum.edu.my/id/eprint/28125/1/document%20%284%29.pdf Hock, Hung Chieng and Wahid, Noorhaniza and Ong, Pauline (2021) Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning. Journal of Information and Communication Technology (JICT), 20 (1). pp. 21-39. ISSN 1675-414X https://www.e-journal.uum.edu.my/index.php/jict/article/view/12398 https://doi.org/10.32890/jict.20.1.2021.9267
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic QA75 Electronic computers. Computer science
spellingShingle QA75 Electronic computers. Computer science
Hock, Hung Chieng
Wahid, Noorhaniza
Ong, Pauline
Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning
description The activation function is a key component in deep learning that performs non-linear mappings between inputs and outputs. The Rectified Linear Unit (ReLU) has been the most popular activation function across the deep learning community. However, ReLU has several shortcomings that can result in inefficient training of deep neural networks: 1) the negative cancellation property of ReLU treats negative inputs as unimportant information for learning, resulting in performance degradation; 2) the inherent predefined nature of ReLU is unlikely to promote additional flexibility, expressivity, and robustness in the networks; 3) the mean activation of ReLU is highly positive and leads to a bias-shift effect in the network layers; and 4) the multi-linear structure of ReLU restricts the non-linear approximation power of the networks. To tackle these shortcomings, this paper introduces Parametric Flatten-T Swish (PFTS) as an alternative to ReLU. With ReLU as the baseline, experiments showed that PFTS improved classification accuracy on the SVHN dataset by 0.31%, 0.98%, 2.16%, 17.72%, 1.35%, 0.97%, 39.99%, and 71.83% on DNN-3A, DNN-3B, DNN-4, DNN-5A, DNN-5B, DNN-5C, DNN-6, and DNN-7, respectively. In addition, PFTS achieved the highest mean rank among the compared methods. The proposed PFTS manifested higher non-linear approximation power during training and thereby improved the predictive performance of the networks.
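To make the description above concrete, below is a minimal PyTorch sketch of a Flatten-T Swish-style activation with a trainable threshold T. This is not the authors' reference implementation: the functional form x * sigmoid(x) + T for non-negative inputs (and T otherwise), the initial value T = -0.20, and the class name ParametricFlattenTSwish are assumptions based on the Flatten-T Swish family the abstract builds on, not details taken from this record.

import torch
import torch.nn as nn

class ParametricFlattenTSwish(nn.Module):
    # Sketch of a Flatten-T Swish-style activation with a learnable threshold T
    # (assumed form: x * sigmoid(x) + T for x >= 0, and T for x < 0).
    def __init__(self, t_init: float = -0.20):
        super().__init__()
        # T is trainable, so the network can adapt the negative-region response
        # instead of cancelling negative inputs the way ReLU does.
        self.t = nn.Parameter(torch.tensor(t_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        positive = x * torch.sigmoid(x) + self.t  # Swish-like branch shifted by T
        return torch.where(x >= 0, positive, self.t.expand_as(x))

# Usage: drop-in replacement for nn.ReLU() in a fully connected layer.
layer = nn.Sequential(nn.Linear(128, 64), ParametricFlattenTSwish())
out = layer(torch.randn(4, 128))

Because T is shared across the layer and receives gradients from both branches, a negative T lowers the mean activation relative to ReLU, which is one plausible way the learnable threshold relates to the bias-shift issue the abstract raises.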
format Article
author Hock, Hung Chieng
Wahid, Noorhaniza
Ong, Pauline
author_facet Hock, Hung Chieng
Wahid, Noorhaniza
Ong, Pauline
author_sort Hock, Hung Chieng
title Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning
title_short Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning
title_full Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning
title_fullStr Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning
title_full_unstemmed Parametric flatten-t swish: an adaptive nonlinear activation function for deep learning
title_sort parametric flatten-t swish: an adaptive nonlinear activation function for deep learning
publisher Universiti Utara Malaysia
publishDate 2021
url https://repo.uum.edu.my/id/eprint/28125/1/document%20%284%29.pdf
https://doi.org/10.32890/jict.20.1.2021.9267
https://repo.uum.edu.my/id/eprint/28125/
https://www.e-journal.uum.edu.my/index.php/jict/article/view/12398