Variance-Based Exploration for Learning Model Predictive Control

Seel, Katrine; Bemporad, Alberto; Gros, Sebastien Nicolas; Gravdahl, Jan Tommy

dc.contributor.author	Seel, Katrine
dc.contributor.author	Bemporad, Alberto
dc.contributor.author	Gros, Sebastien Nicolas
dc.contributor.author	Gravdahl, Jan Tommy
dc.date.accessioned	2023-11-29T08:37:31Z
dc.date.available	2023-11-29T08:37:31Z
dc.date.created	2023-06-06T14:08:10Z
dc.date.issued	2023
dc.identifier.citation	IEEE Access. 2023, 11, 60724-60736.	en_US
dc.identifier.issn	2169-3536
dc.identifier.uri	https://hdl.handle.net/11250/3105167
dc.description.abstract	The combination of model predictive control (MPC) and learning methods has been gaining increasing attention as a tool to control systems that may be difficult to model. Using MPC as a function approximator in reinforcement learning (RL) is one approach to reduce the reliance on accurate models. RL depends on exploration to learn, and currently, simple heuristics based on random perturbations are most common. This paper considers variance-based exploration in RL geared towards using MPC as a function approximator. We propose to use a non-probabilistic measure of uncertainty of the value function approximator in value-based RL methods. Uncertainty is measured by a variance estimate based on inverse distance weighting (IDW). The IDW framework is computationally cheap to evaluate and therefore well suited to an online setting, using already sampled state transitions and rewards. The gradient of the variance estimate is then used to perturb the policy parameters in a direction where the variance of the value function estimate is increasing. The proposed method is verified on two simulation examples, considering both linear and nonlinear system dynamics, and compared to standard exploration methods using random perturbations.	en_US
dc.language.iso	eng	en_US
dc.publisher	IEEE	en_US
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/deed.no	*
dc.title	Variance-Based Exploration for Learning Model Predictive Control	en_US
dc.title.alternative	Variance-Based Exploration for Learning Model Predictive Control	en_US
dc.type	Peer reviewed	en_US
dc.type	Journal article	en_US
dc.description.version	publishedVersion	en_US
dc.source.pagenumber	60724-60736	en_US
dc.source.volume	11	en_US
dc.source.journal	IEEE Access	en_US
dc.identifier.doi	10.1109/ACCESS.2023.3282842
dc.identifier.cristin	2152302
dc.relation.project	Norges forskningsråd: 294544	en_US
dc.relation.project	Norges forskningsråd: 300172	en_US
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	1
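
The exploration idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain 1/d^2 inverse-distance weights, treats the variance as a function of a generic parameter vector with previously sampled (point, value) pairs, and uses a finite-difference gradient. The names idw_weights, idw_variance, variance_gradient and explore are hypothetical, and the formulation in the paper may differ.

```python
import math

def idw_weights(theta, samples, eps=1e-12):
    """Normalized inverse-distance weights of theta w.r.t. sampled points.
    Assumption: plain 1/d^2 weights; other IDW weightings are possible."""
    raw = []
    for i, point in enumerate(samples):
        d2 = sum((a - b) ** 2 for a, b in zip(theta, point))
        if d2 < eps:
            # Exactly at a sampled point: all weight goes to that sample.
            return [1.0 if j == i else 0.0 for j in range(len(samples))]
        raw.append(1.0 / d2)
    total = sum(raw)
    return [w / total for w in raw]

def idw_variance(theta, samples, values):
    """IDW-weighted spread of sampled values around the IDW interpolant."""
    v = idw_weights(theta, samples)
    mean = sum(vi * yi for vi, yi in zip(v, values))
    return sum(vi * (yi - mean) ** 2 for vi, yi in zip(v, values))

def variance_gradient(theta, samples, values, h=1e-6):
    """Central finite-difference gradient of the variance estimate."""
    grad = []
    for k in range(len(theta)):
        up, down = list(theta), list(theta)
        up[k] += h
        down[k] -= h
        diff = idw_variance(up, samples, values) - idw_variance(down, samples, values)
        grad.append(diff / (2.0 * h))
    return grad

def explore(theta, samples, values, step=0.1):
    """Perturb theta by a fixed step in the direction of increasing variance."""
    g = variance_gradient(theta, samples, values)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(theta)  # flat variance estimate: no preferred direction
    return [t + step * gi / norm for t, gi in zip(theta, g)]

# Two hypothetical samples of a scalar parameter and their observed values.
samples = [[0.0], [1.0]]
values = [0.0, 1.0]
theta = [0.25]
theta_new = explore(theta, samples, values)
```

In this toy setting the variance estimate is zero at the sampled points and largest midway between them, so the perturbed parameter moves away from the nearest sample, towards the region where the estimate is less certain.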

Associated file(s)

Filename: Variance-Based_Exploration_for ...
Size: 1.807Mb
Format: PDF

This item appears in the following collection(s)

Department of Computer Science [6582]
Department of Engineering Cybernetics [3684]
Publications from CRIStin - NTNU [37459]

Except where otherwise noted, this item is licensed as Attribution 4.0 International