
dc.contributor.advisor: Schjølberg, Ingrid
dc.contributor.advisor: Lekkas, Anastasios
dc.contributor.author: Khan, Abir
dc.date.accessioned: 2018-09-25T14:03:23Z
dc.date.available: 2018-09-25T14:03:23Z
dc.date.created: 2018-06-21
dc.date.issued: 2018
dc.identifier: ntnudaim:19456
dc.identifier.uri: http://hdl.handle.net/11250/2564506
dc.description.abstract: This thesis introduces the use of Machine Learning, specifically Reinforcement Learning (RL), to create a model-free tracking property for Remotely Operated Vehicles (ROV). In detail, the ROV is trained by an RL algorithm to track an ArUco marker, using an online implementation of a Computer Vision (CV) algorithm for detection. The main motivation is to contribute to increased autonomy in underwater operations by introducing model-free autonomous tracking behavior to underwater vehicles. This approach requires minimal human intervention during operation, while significantly reducing the prior effort of programming the control by hand. First, the tracking behavior was trained in a simulator before physical experiments were conducted with a real ROV in the MC-laboratory at NTNU. The ROV used for the experimental tests is a BlueROV2, which is highly customizable and well suited for R&D purposes. The theory presented in this thesis lays the groundwork for the reasoning applied throughout the project, including the choice of RL method. The RL algorithm chosen for training the tracking behavior is an online Python implementation of the Proximal Policy Optimization (PPO) algorithm. The tracking behavior is trained on a simulator, which is a Python script based on OpenAI's typical simulator architecture. The resulting tracking performance is then evaluated by studying the evolution of accumulated rewards and the ROV's trajectory plots. Although the resulting performance showed some weaknesses, it was good enough to justify testing the trained model in a real-world setting. However, the real-world experiments did not yield positive tracking results, as the ROV moved in a random manner instead of moving towards the ArUco marker.
Several challenges described in the theory section proved to be prevalent during the lab experiments, which disrupted the real-world tracking performance. Nonetheless, based on experience gained from both the simulations and the real-world experiments, various proposals for further work were devised and highlighted. In particular, the importance of appropriate reward function design is underlined.
dc.language: eng
dc.publisher: NTNU
dc.subject: Marine Technology, Marine Cybernetics
dc.title: Deep Reinforcement Learning based tracking behavior for underwater vehicles
dc.type: Master thesis


