Distributed B-SDLM: accelerating the training convergence of deep neural networks through parallelism
This paper proposes an efficient asynchronous stochastic second-order learning algorithm for distributed training of neural networks (NNs). The proposed algorithm, named distributed bounded stochastic diagonal Levenberg-Marquardt (distributed B-SDLM), is based on the B-SDLM algorithm, which converges quickly and requires only minimal computational overhead compared to the stochastic gradient descent (SGD) method. The algorithm is implemented using the parameter server thread model in the MPICH implementation of MPI. Experiments on the MNIST dataset show that training with distributed B-SDLM on a 16-core CPU cluster allows a convolutional neural network (CNN) model to reach convergence much faster, with speedups of 6.03× and 12.28× to reach training and testing loss values of 0.01 and 0.08, respectively. It also takes significantly less time to reach a given classification accuracy (5.67× and 8.72× faster to reach 99% training and 98% testing accuracy on MNIST, respectively).
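The record does not reproduce the algorithm itself, but the update the abstract refers to follows the stochastic diagonal Levenberg-Marquardt pattern: each parameter gets its own learning rate scaled by a damped, capped estimate of the diagonal of the Hessian. The sketch below is a minimal illustration of that kind of update in NumPy; the function name, hyperparameter values, and the exact bounding rule are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sdlm_style_update(w, grad, diag_curv, lr=0.01, mu=0.02, curv_max=10.0):
    """Illustrative bounded stochastic diagonal Levenberg-Marquardt step.

    w         : parameter vector
    grad      : stochastic gradient for the current mini-batch
    diag_curv : estimate of the diagonal of the Hessian (or Gauss-Newton
                approximation) for the same mini-batch
    lr, mu    : base learning rate and damping term (illustrative values)
    curv_max  : cap on the curvature estimate (the "bounded" part, assumed)
    """
    bounded = np.minimum(diag_curv, curv_max)   # cap runaway curvature estimates
    return w - (lr / (bounded + mu)) * grad     # per-parameter scaled step


# Toy usage: the quadratic loss 0.5 * sum(a * w**2) has gradient a * w and
# exact diagonal Hessian a, so directions with very different curvature
# still receive stable, appropriately sized steps.
a = np.array([1.0, 100.0, 0.001])
w = np.ones(3)
for _ in range(200):
    w = sdlm_style_update(w, grad=a * w, diag_curv=a)
print(w)  # every coordinate has moved toward the minimum at 0
```

In the paper, updates of this kind are applied through a parameter server thread model over MPICH; in the usual version of that pattern, worker processes push stochastic gradient (and curvature) information to the server asynchronously, and the server applies the updates and returns the latest parameters.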
Saved in:

| Main Authors: | Liew, S. S.; Khalil-Hani, M.; Bakhteri, R. |
|---|---|
| Format: | Conference or Workshop Item |
| Published: | Springer Verlag, 2016 |
| Subjects: | TK Electrical engineering. Electronics. Nuclear engineering |
| Online Access: | http://eprints.utm.my/id/eprint/73608/ https://www.scopus.com/inward/record.uri?eid=2-s2.0-84984849069&doi=10.1007%2f978-3-319-42911-3_20&partnerID=40&md5=94c32f776924d8a3b0136f9f8f9c873f |
| id | my.utm.73608 |
|---|---|
| record_format | eprints |
| institution | Universiti Teknologi Malaysia (UTM Institutional Repository, http://eprints.utm.my/) |
| citation | Liew, S. S., Khalil-Hani, M. and Bakhteri, R. (2016) Distributed B-SDLM: accelerating the training convergence of deep neural networks through parallelism. In: 14th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2016, 22-26 Aug 2016, Phuket, Thailand. Springer Verlag. Peer reviewed. |