Deep reinforcement learning with robust deep deterministic policy gradient

Deep Deterministic Policy Gradient (DDPG) has recently become a popular deep reinforcement learning algorithm for continuous control problems such as autonomous driving and robotics. Although DDPG can produce very good results, it has drawbacks. DDPG can become unstable and heavily dependent on se...

Bibliographic Details
Main Authors: Teckchai Tiong, Ismail Saad, Kenneth Tze Kin Teo, Herwansyah Lago
Format: Proceedings
Language: English
Published: IEEE Xplore 2020
Subjects: T Technology (General); TK Electrical engineering. Electronics Nuclear engineering
Online Access: https://eprints.ums.edu.my/id/eprint/27893/1/Deep%20reinforcement%20learning%20with%20robust%20deep%20deterministic%20policy%20gradient-Abstract.pdf
https://eprints.ums.edu.my/id/eprint/27893/
https://ieeexplore.ieee.org/document/9309539
id my.ums.eprints.27893
record_format eprints
spelling my.ums.eprints.27893 2021-07-07T07:01:59Z https://eprints.ums.edu.my/id/eprint/27893/ Deep reinforcement learning with robust deep deterministic policy gradient Teckchai Tiong Ismail Saad Kenneth Tze Kin Teo Herwansyah Lago T Technology (General) TK Electrical engineering. Electronics Nuclear engineering Deep Deterministic Policy Gradient (DDPG) has recently become a popular deep reinforcement learning algorithm for continuous control problems such as autonomous driving and robotics. Although DDPG can produce very good results, it has drawbacks. It can become unstable and depends heavily on finding the correct hyperparameters for the current task. The DDPG algorithm risks overestimating the Q-values in the critic (value) network. The accumulation of estimation errors over time can result in the reinforcement learning agent becoming trapped in a local optimum or suffering from catastrophic forgetting. Twin Delayed DDPG (TD3) mitigates the overestimation bias problem but might not reach full performance because of underestimation bias. In this paper, Twin Average Delayed DDPG (TAD3) is proposed as a specific adaptation of TD3, and the resulting algorithm is shown to perform better than TD3 in a challenging continuous control environment. IEEE Xplore 2020-11-28 Proceedings PeerReviewed text en https://eprints.ums.edu.my/id/eprint/27893/1/Deep%20reinforcement%20learning%20with%20robust%20deep%20deterministic%20policy%20gradient-Abstract.pdf Teckchai Tiong and Ismail Saad and Kenneth Tze Kin Teo and Herwansyah Lago (2020) Deep reinforcement learning with robust deep deterministic policy gradient. https://ieeexplore.ieee.org/document/9309539
institution Universiti Malaysia Sabah
building UMS Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sabah
content_source UMS Institutional Repository
url_provider http://eprints.ums.edu.my/
language English
topic T Technology (General)
TK Electrical engineering. Electronics Nuclear engineering
description Deep Deterministic Policy Gradient (DDPG) has recently become a popular deep reinforcement learning algorithm for continuous control problems such as autonomous driving and robotics. Although DDPG can produce very good results, it has drawbacks. It can become unstable and depends heavily on finding the correct hyperparameters for the current task. The DDPG algorithm risks overestimating the Q-values in the critic (value) network. The accumulation of estimation errors over time can result in the reinforcement learning agent becoming trapped in a local optimum or suffering from catastrophic forgetting. Twin Delayed DDPG (TD3) mitigates the overestimation bias problem but might not reach full performance because of underestimation bias. In this paper, Twin Average Delayed DDPG (TAD3) is proposed as a specific adaptation of TD3, and the resulting algorithm is shown to perform better than TD3 in a challenging continuous control environment.
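The bias trade-off mentioned in the description can be illustrated with the bootstrapped critic targets. The sketch below is illustrative only: the function names, the scalar inputs, and the default discount factor are assumptions made for this example. The first function is the standard clipped double-Q target of TD3, which takes the smaller of two critic estimates. The second is a hypothetical averaged variant suggested only by the name "Twin Average"; the abstract does not state how TAD3 combines its critics, so it should not be read as the authors' method.

```python
def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target as used in TD3.

    Bootstraps from the smaller of the two target-critic estimates,
    which counters overestimation but can bias the target downward.
    """
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def averaged_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Hypothetical averaged target (an assumption, not taken from the paper).

    Bootstraps from the mean of the two target-critic estimates,
    which lies between the pessimistic minimum and the larger estimate.
    """
    return reward + gamma * (1.0 - done) * 0.5 * (q1_next + q2_next)


if __name__ == "__main__":
    # Two critics disagree about the value of the next state-action pair.
    r, d, q1, q2 = 1.0, 0.0, 10.0, 12.0
    print("TD3 (min) target:", td3_target(r, d, q1, q2, gamma=0.5))
    print("Averaged target: ", averaged_target(r, d, q1, q2, gamma=0.5))
```

When the two critics disagree, the minimum always yields a target no larger than the average, which is one way to see why a purely pessimistic target may underestimate.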
format Proceedings
author Teckchai Tiong
Ismail Saad
Kenneth Tze Kin Teo
Herwansyah Lago
title Deep reinforcement learning with robust deep deterministic policy gradient
publisher IEEE Xplore
publishDate 2020
url https://eprints.ums.edu.my/id/eprint/27893/1/Deep%20reinforcement%20learning%20with%20robust%20deep%20deterministic%20policy%20gradient-Abstract.pdf
https://eprints.ums.edu.my/id/eprint/27893/
https://ieeexplore.ieee.org/document/9309539