dc.contributor.author: Sadanandan Anand, Akhil
dc.contributor.author: Sawant, Shambhuraj Vijaysinh
dc.contributor.author: Gros, Sebastien Nicolas
dc.contributor.author: Gravdahl, Jan Tommy
dc.date.accessioned: 2024-01-08T08:33:34Z
dc.date.available: 2024-01-08T08:33:34Z
dc.date.created: 2024-01-03T16:32:48Z
dc.date.issued: 2023
dc.identifier.isbn: 978-3-907144-08-4
dc.identifier.uri: https://hdl.handle.net/11250/3110304
dc.description.abstract: The combination of Reinforcement Learning (RL) and Model Predictive Control (MPC) has attracted considerable interest in the recent literature as a way of computing optimal policies from MPC schemes based on inaccurate models. In that context, Deterministic Policy Gradient (DPG) methods are often observed to be the most reliable class of RL methods for improving the MPC closed-loop performance. DPG methods are fairly easy to formulate when a compatible function approximation is used as an advantage function. However, this formulation requires an additional value function approximation, often carried out using Deep Neural Networks (DNNs). In this paper, we propose to estimate the required value function approximation as a first-order expansion of the value function estimate delivered by the MPC scheme providing the policy. The proposed approach drastically simplifies the use of DPG methods for learning-based MPC, as no additional structure for approximating the value function needs to be constructed. We illustrate the proposed approach with two numerical examples of varying complexity. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartof: Proceedings of 2023 European Control Conference (ECC)
dc.title: A Painless Deterministic Policy Gradient Method for Learning-based MPC [en_US]
dc.title.alternative: A Painless Deterministic Policy Gradient Method for Learning-based MPC [en_US]
dc.type: Chapter [en_US]
dc.description.version: publishedVersion [en_US]
dc.identifier.cristin: 2220176
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1
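
To make the idea in the abstract concrete, below is a minimal, self-contained Python sketch of one DPG iteration in which the critic combines a compatible advantage with a value model built as a first-order expansion of the MPC value estimate. Everything in it is an illustrative assumption, not the paper's actual scheme or benchmarks: the toy linear dynamics, the linear feedback standing in for the MPC policy pi_theta, the quadratic stand-in for the MPC optimal cost V_theta, and all step sizes are hypothetical.

```python
import numpy as np

# Illustrative sketch only: toy linear dynamics and a quadratic stand-in
# replace the actual parametrized MPC scheme of the paper.
rng = np.random.default_rng(0)

n, m = 2, 1                          # state / input dimensions (assumed)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
gamma, sigma = 0.95, 0.1             # discount factor, exploration noise

def stage_cost(s, a):
    return float(s @ s + 0.1 * a @ a)

theta = np.array([0.4, 0.6])         # parameters shared by policy and value

def pi(s, th):
    """Stand-in for the MPC policy: linear state feedback (hypothetical)."""
    return np.array([-(th @ s)])

def dpi_dth(s, th):
    """Policy sensitivity d pi / d theta, shape (p, m)."""
    return -s.reshape(-1, 1)

def V_mpc(s, th):
    """Stand-in for the MPC optimal cost-to-go V_theta(s)."""
    return float(s @ np.diag(th ** 2) @ s)

def dV_dth(s, th):
    """Sensitivity of the MPC value w.r.t. theta."""
    return 2.0 * th * s ** 2

p = theta.size
w = np.zeros(p)                      # compatible-advantage weights
v = np.zeros(p)                      # first-order value-expansion weights
alpha_c, alpha_a = 0.05, 0.001       # critic / actor step sizes (assumed)

s = rng.normal(size=n)
for _ in range(5000):
    a = pi(s, theta) + sigma * rng.normal(size=m)   # exploratory input
    s_next = A @ s + B @ a
    c = stage_cost(s, a)

    # Compatible advantage feature: psi(s, a) = dpi/dth (a - pi(s)).
    psi = dpi_dth(s, theta) @ (a - pi(s, theta))
    # Value model: MPC value plus a learned first-order correction,
    # V_v(s) = V_theta(s) + dV/dth(s)^T v -- no separate DNN critic.
    V = V_mpc(s, theta) + dV_dth(s, theta) @ v
    V_next = V_mpc(s_next, theta) + dV_dth(s_next, theta) @ v
    Q = psi @ w + V

    td = c + gamma * V_next - Q      # TD error (costs are minimized)
    w += alpha_c * td * psi          # semi-gradient critic updates
    v += alpha_c * td * dV_dth(s, theta)

    # DPG step with the compatible critic:
    # grad_theta J ~= E[ dpi/dth dpi/dth^T ] w, so descend along M @ w.
    M = dpi_dth(s, theta) @ dpi_dth(s, theta).T
    theta -= alpha_a * (M @ w)

    s = s_next if np.linalg.norm(s_next) < 10.0 else rng.normal(size=n)
```

In the paper's setting, the value sensitivity would come from parametric sensitivity analysis of the MPC optimization problem, which is a cheap by-product of solving it; that is what removes the need to design and train an additional value-function approximator.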