Engineering Applications of Model Predictive Control-based Reinforcement Learning
Doctoral thesis
Permanent link
https://hdl.handle.net/11250/3151159
Date of issue
2024
Abstract
Model Predictive Control (MPC) has emerged as a highly influential control strategy, leveraging models of the real system dynamics to generate input-state sequences that minimize a cost subject to constraints. However, building an accurate MPC model, especially for stochastic systems, remains a significant challenge and can lead to performance degradation. The integration of Machine Learning (ML) into Data-driven Model Predictive Control (DMPC), aimed at alleviating this issue, has brought problems of its own. Specifically, modeling in DMPC is often disconnected from the control objectives: ML-based models are fitted for prediction accuracy rather than for MPC performance, which can lead to significantly suboptimal policies.
Reinforcement Learning (RL), a model-free approach, has emerged as a promising alternative, with the core advantage of learning policies through interaction with the environment. Although it does not rely on system models, conventional RL methods are known to suffer from extensive data requirements, a lack of formal tools for satisfying system constraints, and challenges related to the parameterization of Deep Neural Networks (DNNs).
This thesis focuses on an innovative Model Predictive Control-based Reinforcement Learning (MPC-based RL) method that amalgamates the strengths of MPC and RL, compensating for the shortcomings of both. The approach parameterizes the MPC model, cost, and constraints, and applies RL to tune these parameters so as to minimize the closed-loop cost. This fusion yields a method that not only takes advantage of prior knowledge but also handles system constraints, supports stability analysis, copes with uncertainty, and deals with long-term or even infinite-horizon problems. The applicability and effectiveness of the MPC-based RL approach are demonstrated through three engineering applications, each characterized by the absence of an exact model, high uncertainty, or an economic cost function.
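The core idea of tuning a parameterized controller against closed-loop performance can be illustrated with a deliberately minimal sketch. Everything here is an assumption for demonstration, not the thesis algorithm: the system is a scalar stochastic linear plant, the "MPC" is reduced to a parameterized linear feedback standing in for the parameterized MPC solution, and the RL step is a finite-difference gradient on episodic closed-loop cost.

```python
import random

# Toy stand-in for MPC-based RL: scalar system x' = A*x + B*u + w with
# stage cost x^2 + R*u^2. The parameterized policy u = -theta*x plays the
# role of the parameterized MPC; RL-style tuning adjusts theta to lower
# the observed closed-loop cost rather than to fit a prediction model.
A, B, R = 0.9, 0.5, 0.1  # assumed plant and cost parameters

def rollout(theta, horizon=50, seed=0):
    """Closed-loop episodic cost of the policy u = -theta*x."""
    rng = random.Random(seed)
    x, cost = 1.0, 0.0
    for _ in range(horizon):
        u = -theta * x
        cost += x * x + R * u * u
        x = A * x + B * u + 0.01 * rng.gauss(0.0, 1.0)  # process noise
    return cost

def tune(theta=0.0, iters=200, lr=0.05, eps=1e-2):
    """Finite-difference gradient descent on the closed-loop cost."""
    for i in range(iters):
        seed = i  # common random numbers for both perturbations
        grad = (rollout(theta + eps, seed=seed)
                - rollout(theta - eps, seed=seed)) / (2 * eps)
        theta -= lr * grad
    return theta

theta_star = tune()  # tuned feedback gain; closed-loop cost drops vs. theta=0
```

The design point mirrored here is that the parameters are judged purely by the cost the closed loop actually incurs, not by how well any internal model predicts the next state.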
• Autonomous Surface Vehicle (ASV): By applying the MPC-based RL method to ASVs, the thesis demonstrates its efficacy in optimizing a simplified freight mission with constraints such as collision-free path tracking and autonomous docking. Simulations demonstrated improved closed-loop performance.
• Energy Management in Residential Microgrids: In a more complex scenario involving fluctuating spot-market prices and uncertainties, the MPC-based RL approach effectively optimized benefits for residential microgrid systems, substantially reducing economic costs while ensuring user comfort. The application also introduced the Shapley value method for equitable bill distribution among residents.
• Home Energy Management System (HEMS): The third application tackled a real-world problem in HEMS, dealing with discrepancies due to model mismatch and uncertainties in various parameters. The MPC-based RL approach was shown to deliver policies that satisfy thermal comfort requirements while reducing economic cost, even with inaccurate models obtained from model fitting.
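The Shapley-value bill sharing mentioned in the microgrid application can be sketched briefly. The coalition cost function below is a hypothetical placeholder (pooled demand earns a flat discount), not the thesis model; the Shapley computation itself, averaging each resident's marginal contribution over all join orders, is standard.

```python
from itertools import permutations

def coalition_cost(members, demand, shared_discount=0.2):
    """Assumed cost model: pooling residents discounts the total bill."""
    total = sum(demand[m] for m in members)
    if len(members) > 1:
        total *= (1.0 - shared_discount)  # illustrative pooling benefit
    return total

def shapley_shares(players, cost):
    """Average marginal contribution of each player over all join orders."""
    shares = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = []
        for p in order:
            before = cost(coalition)
            coalition.append(p)
            shares[p] += cost(coalition) - before
    return {p: s / len(orders) for p, s in shares.items()}

demand = {"A": 10.0, "B": 20.0, "C": 30.0}  # hypothetical kWh-based bills
bills = shapley_shares(list(demand), lambda c: coalition_cost(c, demand))
```

By construction the shares sum exactly to the grand-coalition bill (efficiency), and under this cost model each resident pays less than their stand-alone bill, which is what makes the split equitable.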
In summary, the thesis contributes a nuanced understanding of the potential synergies between MPC and RL, unveiling an approach that transcends the boundaries of conventional methods. By adapting the MPC-based RL method into tailored algorithms for the three applications, the research verifies its theoretical merits, proposes new solutions to challenging engineering problems, and identifies potential methodological issues.