Designing and Improving Techniques for Underwater Visual SLAM and Obstacle Avoidance
Abstract
This thesis focuses on designing and improving techniques for visual simultaneous localization and mapping (Visual SLAM or VSLAM) and obstacle avoidance in the underwater environment.
From the perspective of underwater robotics, the underwater environment is rich in both opportunities and dangers. Opportunities include climate change monitoring, flora and fauna mapping, biological studies and preservation, mining, extraction of natural resources, and monitoring of underwater infrastructure. Threats are typically represented by obstacles, rough seas and currents, non-traversable fields, and marine fauna.
While sonars can provide information about the physical elements present around the robot, they cannot capture color and semantic information, which makes it hard or impossible to achieve a certain level of autonomy. The robot therefore also needs to perceive the world through a camera.
Visual SLAM is the process of using a camera to map the environment while simultaneously localizing the robot within it, all in soft real time. This is crucial because such information can be exploited to perform automatic re-routing and obstacle avoidance.
This thesis first presents a stereo-camera-based obstacle avoidance field trial, which demonstrates how a stereo camera with proper active illumination can be used instead of an acoustic sensor to perform active obstacle avoidance. It then presents a series of methods to improve the robustness and performance of underwater VSLAM.
In particular, emphasis is placed on monocular VSLAM, which is highly interesting in terms of robustness: a monocular VSLAM system can substitute for a stereo VSLAM system if one of the cameras malfunctions. It is also superior in terms of compactness, as a single camera is easier and cheaper to place than two, and sometimes a single camera is the only available option for small robots.
This thesis presents a single-image-only, standalone, and global method for robust loop closure detection, based on the encoded representation produced by a convolutional autoencoder. It also presents a keypoint rejection method for feature-based VSLAM, which prevents features from being detected on unsuitable surfaces such as loose seaweed, marine fauna, and caustics. Finally, it presents a series of modifications to ORB-SLAM 2, one of the most successful feature-based monocular SLAM systems, in order to improve it for use in the underwater environment.
These modifications include a slight change to the initialization procedure, a feature matching approach that yields a larger number of valid matches, a partial synchronization between the front-end and the back-end, a procedure to detect station keeping without needing to know the scale of the movements, and a pruning procedure that enables lifelong operations.
This thesis is edited as a collection of papers.
Has parts
Paper A: Leonardi, Marco; Stahl, Annette; Ludvigsen, Martin; Nornes, Stein Melvær; Gazzea, Michele; Rist-Christensen, Ida. Vision based obstacle avoidance and motion tracking for autonomous behaviors in underwater vehicles. OCEANS 2017; 2017-06-19 - 2017-06-22
Paper B: Leonardi, Marco; Stahl, Annette. Convolutional Autoencoder aided loop closure detection for monocular SLAM. In: 11th IFAC Conference on Control Applications in Marine Systems, Robotics, and Vehicles CAMS 2018 Opatija, Croatia, 10–12 September 2018. Elsevier 2018 ISBN 0000000000. pp. 159-164
Paper C: Leonardi, Marco; Fiori, Luca; Stahl, Annette. Deep learning based keypoint rejection system for underwater visual ego-motion estimation. IFAC-PapersOnLine 2021; Volume 53(2), pp. 9471-9477
Paper D: Leonardi, Marco; Stahl, Annette; Brekke, Edmund Førland; Ludvigsen, Martin. A Robust Monocular Visual SLAM System for Lifelong Underwater Operations. This paper is not yet published and is therefore not included.