Trustworthy Machine Learning for Controlled Dynamic Systems

Robinson, Haakon Rennesvik

dc.contributor.advisor	Rasheed, Adil
dc.contributor.advisor	Varagnolo, Damiano
dc.contributor.author	Robinson, Haakon Rennesvik
dc.date.accessioned	2023-09-04T12:34:00Z
dc.date.available	2023-09-04T12:34:00Z
dc.date.issued	2023
dc.identifier.isbn	978-82-326-7039-0
dc.identifier.issn	2703-8084
dc.identifier.uri	https://hdl.handle.net/11250/3087319
dc.description.abstract	Recent advances in machine learning (ML) have helped solve many problems once thought impossible for computers to solve. These methods are increasingly used to develop flexible agents that learn to model and control through direct interaction with real-world systems. Despite encouraging results, it is difficult to predict the behaviour of such agents, which prevents their use in safety-critical applications. This thesis is a broad investigation into the safe application of machine learning methods to three different problems in the context of controlled dynamical systems, namely (i) control design, (ii) system identification, and (iii) verification. One of the main ways to realise a learning controller is through reinforcement learning (RL), where a learning agent is rewarded when it reaches a goal or performs a desired behaviour. A variety of RL methods are applied to the problems of path-following and collision avoidance (COLAV) for an autonomous ship. These two goals are occasionally in conflict. A parameter controlling the tradeoff between path-following and COLAV is given to the agent as an additional “insight”, allowing the agent’s priorities to be controlled during operation. However, even though the “conservativeness” of the agent is controllable using this scheme, experiments show that collisions can still occur on rare occasions, thus confirming the need for improved safety measures when using ML methods. A guarantee of safety can be achieved by introducing a failsafe controller known to be safe, a so-called safety filter. The second contribution of this thesis is the implementation of a predictive safety filter for the milliAmpere ferry. This auxiliary system is formulated as an optimal control problem (OCP), where the optimal solution is a minimal perturbation to a nominal input to the system such that the safety constraints are satisfied. The safety filter can be used with arbitrary controllers (e.g. RL agents) while guaranteeing safe operation.
System identification is closely related to ML. In both cases, a set of model structures is chosen, the parameters are selected such that the predictions match the available data without overfitting, and the model candidates are evaluated according to task-specific metrics. Model structures are often designed from first principles following physical laws, an approach referred to as physics-based modelling (PBM) in this thesis. PBM becomes more challenging as the complexity or scale of the system increases, and assumptions typically have to be made to make the resulting model tractable. Modern ML practitioners are increasingly turning to more flexible model structures such as neural networks (NNs) that can scale well without the need for bespoke model structures; this is referred to as data-driven modelling (DDM) in this thesis. The combination of PBM and DDM techniques is also investigated in this thesis. We propose the physics-guided neural network (PGNN), a novel NN architecture where physics-based priors are injected into the intermediate layers of a NN. This method is found to improve accuracy and generalisation on a variety of dynamical system modelling tasks. However, the choice of injection layer plays a significant role, and there is no principled way to choose the correct layer a priori. Another way to augment PBM with DDM is through “boosting”, i.e. training a second model to correct the errors of the first. This method is also known as the Corrective source term approach (CoSTA), and we apply it to an ablated model of an aluminium electrolysis cell and show that the corrected model is more accurate and stable than a purely data-driven model. Stability is a common issue when using NNs to model dynamical systems. This thesis investigates how regularisation and network architecture can affect the stability of the resulting system.
The results show that introducing ℓ1 regularisation and skip connections can significantly improve the predictive stability of NNs and that these measures are most effective when used together. These effects persist even when the amount of training data is reduced. The third and last part of this thesis studies NNs as piecewise affine (PWA) systems, which is an exact correspondence when the only nonlinearities present are PWA functions. The work represents a step towards practical algorithms that can be used to verify the safety of black-box models. The first contribution is a memory-efficient algorithm for computing the linear pieces of a NN. However, the number of pieces grows exponentially with the network’s depth and the input space’s dimension, limiting the method to relatively small networks. Despite this, studying smaller systems can still yield insights. A series of experiments were performed on NNs trained to mimic a damped pendulum with different forms of regularisation. The linear regions of the network were recorded regularly during the training process. It is found that ℓ1 regularisation significantly reduces the apparent number of regions, and a simple mechanism is proposed to explain this. Regularising using the ℓ2 norm has a similar but weaker effect. Dropout regularisation is not found to change the number of regions significantly but instead affects the structure of the regions. Weight normalisation is found to negate the observed effects. One motivation for these methods is that reducing the number of regions of a NN may make verification methods tractable. However, regularisation by itself appears insufficient. Instead, an additional algorithm to discard insignificant regions is proposed. A nonlinear benchmark based on modelling the vibrations of a wing/payload system is chosen as a case study. It is shown that regularisation can significantly reduce the number of regions at the cost of accuracy.
Weight pruning is shown to have little effect. In comparison, the PWA approximation algorithm runs efficiently on a network with ten inputs and sacrifices little accuracy.	en_US
dc.language.iso	eng	en_US
dc.publisher	NTNU	en_US
dc.relation.ispartofseries	Doktoravhandlinger ved NTNU;2023:168
dc.title	Trustworthy Machine Learning for Controlled Dynamic Systems	en_US
dc.type	Doctoral thesis	en_US
dc.subject.nsi	VDP::Teknologi: 500::Informasjons- og kommunikasjonsteknologi: 550::Teknisk kybernetikk: 553	en_US
dc.description.localcode	Figure 3.13, which is a picture of the milliAmpere ferry, is published under a Creative Commons licence.	en_US

Associated file(s)

File name: Haakon Rennesvik Robinson.pdf
Size: 19.91 MB
Format: PDF

This item appears in the following collection(s)

Institutt for teknisk kybernetikk [3674]
