Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples

Meng, Li; Yazidi, Anis; Goodwin, Morten; Engelstad, Paal

dc.contributor.author	Meng, Li
dc.contributor.author	Yazidi, Anis
dc.contributor.author	Goodwin, Morten
dc.contributor.author	Engelstad, Paal
dc.date.accessioned	2023-01-25T10:37:49Z
dc.date.available	2023-01-25T10:37:49Z
dc.date.created	2023-01-19T11:55:03Z
dc.date.issued	2022
dc.identifier.citation	Proceedings of the Northern Lights Deep Learning Workshop. 2022.	en_US
dc.identifier.uri	https://hdl.handle.net/11250/3046199
dc.description.abstract	In this article, we propose a novel algorithm for deep reinforcement learning named Expert Q-learning. Expert Q-learning is inspired by Dueling Q-learning and aims to incorporate semi-supervised learning into reinforcement learning by splitting Q-values into state values and action advantages. We require that an offline expert assess the value of a state in a coarse manner using three discrete values. In addition to the Q-network, we design an expert network, which is updated after each regular offline minibatch update whenever the expert example buffer is not empty. Using the board game Othello, we compare our algorithm with the baseline Q-learning algorithm, which is a combination of Double Q-learning and Dueling Q-learning. Our results show that Expert Q-learning is useful and more resistant to overestimation bias. The baseline Q-learning algorithm exhibits unstable and suboptimal behavior in non-deterministic settings, whereas Expert Q-learning demonstrates more robust performance with higher scores, illustrating that our algorithm is suitable for integrating state values from expert examples into Q-learning.	en_US
dc.language.iso	eng	en_US
dc.publisher	Septentrio Academic Publishing	en_US
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/deed.no	*
dc.title	Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples	en_US
dc.title.alternative	Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples	en_US
dc.type	Peer reviewed	en_US
dc.type	Journal article	en_US
dc.description.version	publishedVersion	en_US
dc.source.pagenumber	9	en_US
dc.source.volume	3	en_US
dc.source.journal	Proceedings of the Northern Lights Deep Learning Workshop	en_US
dc.identifier.doi	10.7557/18.6237
dc.identifier.cristin	2110224
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	1

Files in this item

Name: Expert_Q-learning.pdf
Size: 1.033Mb
Format: PDF

This item appears in the following Collection(s)

Institutt for datateknologi og informatikk [6569]
Publikasjoner fra CRIStin - NTNU [37384]

Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)