Dual experience replay enhanced deep deterministic policy gradient for efficient continuous data sampling

To address the inefficiencies in sample utilization and policy instability in asynchronous distributed reinforcement learning, we propose TPDEB—a dual experience replay framework that integrates prioritized sampling and temporal diversity. While recent distributed RL systems have scaled well, they o...

Bibliographic Details
Main Authors: Mohd Aris, Teh Noranis, Chen, Ningning, Mustapha, Norwati, Zolkepli, Maslina
Format: Article
Language: English
Published: Public Library of Science 2025
Online Access: http://psasir.upm.edu.my/id/eprint/124660/1/124660.pdf
http://psasir.upm.edu.my/id/eprint/124660/
https://dx.plos.org/10.1371/journal.pone.0334411