Deep Reinforcement Learning based tracking behavior for underwater vehicles

Khan, Abir

dc.contributor.advisor	Schjølberg, Ingrid
dc.contributor.advisor	Lekkas, Anastasios
dc.contributor.author	Khan, Abir
dc.date.accessioned	2018-09-25T14:03:23Z
dc.date.available	2018-09-25T14:03:23Z
dc.date.created	2018-06-21
dc.date.issued	2018
dc.identifier	ntnudaim:19456
dc.identifier.uri	http://hdl.handle.net/11250/2564506
dc.description.abstract	This thesis applies Machine Learning, specifically Reinforcement Learning (RL), to create a model-free tracking behavior for Remotely Operated Vehicles (ROVs). The ROV is trained by an RL algorithm to track an ArUco marker, with an online implementation of a Computer Vision (CV) algorithm used for detection. The main motivation is to contribute to increased autonomy in underwater operations by introducing model-free autonomous tracking behavior to underwater vehicles. This approach requires minimal human intervention during operation, while significantly reducing the prior effort of programming the control by hand. The tracking behavior was first trained in a simulator before physical experiments were conducted with a real ROV in the MC-laboratory at NTNU. The ROV used for the experimental tests is a BlueROV2, which is highly customizable and well suited for R&D purposes. The theory presented in this thesis lays the groundwork for the reasoning behind the choices made in the course of the project, including the choice of RL method. The RL algorithm chosen for training the tracking behavior is an online Python implementation of the Proximal Policy Optimization (PPO) algorithm. The tracking behavior is trained in a simulator, which is a Python script based on the typical architecture of OpenAI's simulators. The resulting tracking performance is then evaluated by studying the evolution of the accumulated rewards and the ROV's trajectory plots. Although the resulting performance showed some weaknesses, it was good enough to justify testing the trained model in a real-world setting. However, the real-world experiments did not yield positive tracking results: the ROV moved in a random manner instead of moving towards the ArUco marker.
Several challenges described in the theory section proved to be prevalent during the lab experiments and disrupted the real-world tracking performance. Nonetheless, based on experience gained from both the simulations and the real-world experiments, various proposals for further work were devised and highlighted. In particular, the importance of appropriate reward function design is underlined.
dc.language	eng
dc.publisher	NTNU
dc.subject	Marine Technology, Marine Cybernetics
dc.title	Deep Reinforcement Learning based tracking behavior for underwater vehicles
dc.type	Master thesis

Associated file(s)

Filename: 19456_FULLTEXT.pdf
Size: 6.286Mb
Format: PDF

Filename: 19456_ATTACHMENT.zip
Size: 848.6Kb
Format: application/zip

Filename: 19456_COVER.pdf
Size: 1.556Mb
Format: PDF

This item appears in the following collection(s)

Institutt for marin teknikk [3436]