A Study on Abstract Policy for Acceleration of Reinforcement Learning
Reinforcement learning (RL) is well known as a method that can be applied to unknown problems. However, because optimization at every state requires trial and error, the learning time becomes long when the environment has many states. If there exist solutions to similar problems and they are...
Main Authors: | Ahmad Afif, Mohd Faudzi; Hirotaka, Takano; Junichi, Murata |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2014 |
Subjects: | TK Electrical engineering. Electronics. Nuclear engineering |
Online Access: | http://umpir.ump.edu.my/id/eprint/7452/1/A_Study_on_Abstract_Policy_for_Acceleration_of_Reinforcement_Learning.pdf http://umpir.ump.edu.my/id/eprint/7452/ http://dx.doi.org/10.1109/SICE.2014.6935300 |
id |
my.ump.umpir.7452 |
---|---|
record_format |
eprints |
spelling |
my.ump.umpir.7452 2016-04-19T07:31:26Z http://umpir.ump.edu.my/id/eprint/7452/ A Study on Abstract Policy for Acceleration of Reinforcement Learning Ahmad Afif, Mohd Faudzi; Hirotaka, Takano; Junichi, Murata TK Electrical engineering. Electronics. Nuclear engineering Reinforcement learning (RL) is well known as a method that can be applied to unknown problems. However, because optimization at every state requires trial and error, the learning time becomes long when the environment has many states. If solutions to similar problems exist and are used during exploration, some of the trial and error can be spared and learning can take less time. In this paper, the authors propose reusing an abstract policy, a representative of a solution constructed by the learning vector quantization (LVQ) algorithm, to improve the initial performance of an RL learner on a similar but different problem. Furthermore, they investigate whether the policy can adapt to a new environment while preserving its performance in the old environments. Simulations show good results in terms of learning acceleration and adaptation of the abstract policy. 2014 Conference or Workshop Item PeerReviewed application/pdf en http://umpir.ump.edu.my/id/eprint/7452/1/A_Study_on_Abstract_Policy_for_Acceleration_of_Reinforcement_Learning.pdf Ahmad Afif, Mohd Faudzi and Hirotaka, Takano and Junichi, Murata (2014) A Study on Abstract Policy for Acceleration of Reinforcement Learning. In: Proceedings of the SICE Annual Conference (SICE), 9-12 Sept. 2014, Sapporo, Japan. pp. 1793-1798. http://dx.doi.org/10.1109/SICE.2014.6935300 |
institution |
Universiti Malaysia Pahang |
building |
UMP Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Pahang |
content_source |
UMP Institutional Repository |
url_provider |
http://umpir.ump.edu.my/ |
language |
English |
topic |
TK Electrical engineering. Electronics. Nuclear engineering |
description |
Reinforcement learning (RL) is well known as a method that can be applied to unknown problems. However, because optimization at every state requires trial and error, the learning time becomes long when the environment has many states. If solutions to similar problems exist and are used during exploration, some of the trial and error can be spared and learning can take less time. In this paper, the authors propose reusing an abstract policy, a representative of a solution constructed by the learning vector quantization (LVQ) algorithm, to improve the initial performance of an RL learner on a similar but different problem. Furthermore, they investigate whether the policy can adapt to a new environment while preserving its performance in the old environments. Simulations show good results in terms of learning acceleration and adaptation of the abstract policy. |
format |
Conference or Workshop Item |
author |
Ahmad Afif, Mohd Faudzi; Hirotaka, Takano; Junichi, Murata |
author_sort |
Ahmad Afif, Mohd Faudzi |
title |
A Study on Abstract Policy for Acceleration of Reinforcement Learning |
title_sort |
study on abstract policy for acceleration of reinforcement learning |
publishDate |
2014 |
url |
http://umpir.ump.edu.my/id/eprint/7452/1/A_Study_on_Abstract_Policy_for_Acceleration_of_Reinforcement_Learning.pdf http://umpir.ump.edu.my/id/eprint/7452/ http://dx.doi.org/10.1109/SICE.2014.6935300 |