Deep reinforcement learning with robust deep deterministic policy gradient
Deep Deterministic Policy Gradient (DDPG) is a popular deep reinforcement learning algorithm applied to continuous control problems such as autonomous driving and robotics. Although DDPG can produce very good results, it has drawbacks. DDPG can become unstable and heavily dependent on se...
Main Authors: | Teckchai Tiong, Ismail Saad, Kenneth Tze Kin Teo, Herwansyah Lago |
---|---|
Format: | Proceedings |
Language: | English |
Published in: | IEEE Xplore, 2020 |
Subjects: | T Technology (General); TK Electrical engineering. Electronics Nuclear engineering |
Online Access: | https://eprints.ums.edu.my/id/eprint/27893/1/Deep%20reinforcement%20learning%20with%20robust%20deep%20deterministic%20policy%20gradient-Abstract.pdf https://eprints.ums.edu.my/id/eprint/27893/ https://ieeexplore.ieee.org/document/9309539 |
id |
my.ums.eprints.27893 |
---|---|
record_format |
eprints |
spelling |
my.ums.eprints.27893 2021-07-07T07:01:59Z IEEE Xplore 2020-11-28 Proceedings PeerReviewed text en Teckchai Tiong and Ismail Saad and Kenneth Tze Kin Teo and Herwansyah Lago (2020) Deep reinforcement learning with robust deep deterministic policy gradient. |
institution |
Universiti Malaysia Sabah |
building |
UMS Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Sabah |
content_source |
UMS Institutional Repository |
url_provider |
http://eprints.ums.edu.my/ |
language |
English |
topic |
T Technology (General) TK Electrical engineering. Electronics Nuclear engineering |
description |
Deep Deterministic Policy Gradient (DDPG) is a popular deep reinforcement learning algorithm applied to continuous control problems such as autonomous driving and robotics. Although DDPG can produce very good results, it has drawbacks. DDPG can become unstable and depends heavily on finding the correct hyperparameters for the current task. The DDPG algorithm risks overestimating the Q-values in the critic (value) network. The accumulation of estimation errors over time can cause the reinforcement learning agent to become trapped in a local optimum or to suffer from catastrophic forgetting. Twin Delayed DDPG (TD3) mitigates the overestimation bias but might not reach full performance because of underestimation bias. In this paper, Twin Average Delayed DDPG (TAD3) is proposed as a specific adaptation of TD3, and the resulting algorithm is shown to perform better than TD3 in a challenging continuous control environment. |
format |
Proceedings |
author |
Teckchai Tiong Ismail Saad Kenneth Tze Kin Teo Herwansyah Lago |
title |
Deep reinforcement learning with robust deep deterministic policy gradient |
publisher |
IEEE Xplore |
publishDate |
2020 |
url |
https://eprints.ums.edu.my/id/eprint/27893/1/Deep%20reinforcement%20learning%20with%20robust%20deep%20deterministic%20policy%20gradient-Abstract.pdf https://eprints.ums.edu.my/id/eprint/27893/ https://ieeexplore.ieee.org/document/9309539 |
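The abstract in this record contrasts overestimation in DDPG's critic with the underestimation that TD3's pessimistic target can introduce. The sketch below is only an illustration of that contrast, not the paper's method: it computes a one-step bootstrapped target from two target-critic estimates, once with the TD3-style minimum and once with a plain average. The averaging rule, the function names, and all numeric values are assumptions made for this example; the record does not state how TAD3 combines its critics.

```python
from typing import Sequence


def td_target(reward: float, done: bool, next_q: float, gamma: float = 0.99) -> float:
    """One-step bootstrapped target: r + gamma * (1 - done) * next_q."""
    return reward + gamma * (0.0 if done else 1.0) * next_q


def clipped_double_q(q1: float, q2: float) -> float:
    """TD3-style pessimistic estimate: the smaller of the two target critics."""
    return min(q1, q2)


def averaged_q(q_values: Sequence[float]) -> float:
    """Hypothetical averaged estimate (illustration only, not the paper's TAD3 rule)."""
    return sum(q_values) / len(q_values)


# Made-up target-critic outputs for a single next state-action pair.
q1_next, q2_next = 10.0, 12.0
reward, done = 1.0, False

# The minimum never exceeds the average, so the TD3-style target is the
# more conservative of the two.
td3_style = td_target(reward, done, clipped_double_q(q1_next, q2_next))
averaged = td_target(reward, done, averaged_q([q1_next, q2_next]))

print(f"TD3-style target: {td3_style:.2f}")
print(f"Averaged target:  {averaged:.2f}")
```

Because the minimum of two noisy estimates is biased low, the TD3-style target trades overestimation for possible underestimation, which is the trade-off the abstract refers to.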