Show simple item record

dc.contributor.advisor: Stahl, Annette
dc.contributor.advisor: Mathiassen, John Reidar
dc.contributor.author: Dyrstad, Jonatan Sjølund
dc.date.accessioned: 2023-08-24T12:29:38Z
dc.date.available: 2023-08-24T12:29:38Z
dc.date.issued: 2023
dc.identifier.isbn: 978-82-326-7231-8
dc.identifier.issn: 2703-8084
dc.identifier.uri: https://hdl.handle.net/11250/3085661
dc.description.abstract: Flexible robots, capable of manipulating objects in unstructured environments under changing conditions, will lead to a paradigm shift in automation. Such robotic solutions can potentially transform entire industries currently subject to little or no automation, and they will eventually have a profound impact on our daily lives. Today's robotic solutions cope poorly with large degrees of variability. They are highly specialized machines that typically operate in structured environments designed to cater to the robot's strengths. The robots' inability to handle unstructured environments and cluttered scenery makes them unsuited for many real-world manipulation tasks. Many of these tasks are inherently cluttered and subject to large variations, such as tasks involving manipulation of raw materials or sorting of objects. Furthermore, automating these tasks requires developing highly specialized machinery with costly and long development cycles. This is a limiting factor in today's application of robotics to automation.

To handle unstructured and cluttered domains, robots need to sense and reason about their environment in order to select and execute an action appropriate to the situation. This necessitates a visual processing system capable of extracting relevant information from the environment at high speed. The goal of this thesis has been to contribute towards the development of a visual processing system suitable for real-world robotic manipulation tasks. We approach this goal by first considering the task of open-loop grasping. To predict precise, collision-free grasps, the system needs to extract accurate features from noisy data. Robotic grasping is also a highly relevant automation task across many industries, and there is a need for a robust grasping system that can handle noisy data and large variability in the appearance of objects.

The result of our work is a visual processing system that can be trained to process large point clouds by sequential focusing of attention. By learning to attend to the relevant parts of the volume, the proposed system can extract high-precision features from large volumes at high speed. In our work on grasping, we consider bin-picking of fish, a difficult task subject to clutter and noisy depth measurements. With the proposed visual processing system we achieve a 95 % grasp success rate on this task, and the system is able to correct its own mistakes by trying again. Further, the system is trained solely on synthetically generated data sets, and a generic pipeline for generating such data sets for grasping has been developed. We envision that this approach can enable significantly shorter development cycles for new robotic applications and easy repurposing of robots for new tasks. In turn, we hope that this can open up more automation in domains previously less suited for it, such as production involving smaller quantities, more varied materials or raw materials, or seasonal variability.

The proposed system can process arbitrarily large volumes at a speed of 15 Hz. This indicates that the system can be used for closed-loop control, which can enable learning of more complex robot actions and sequences of actions. In this thesis we present the results of preliminary tests designed to probe the system's capabilities in real-time applications. In these tests we considered two simple visual servoing tasks and trained the system with behavioural cloning. Through these two experiments, the system has proven capable of learning to effectively summarize the contents of larger volumes through the attention mechanism and to use this summary for subsequent decision making. When dealing with sequences of actions, it is also able to infer the context from its observations and shift its focus of attention upon completion of a sub-task. However, these are only preliminary tests, and the limits of the current system for tasks involving more complex relationships between objects and long-term memory are still unknown. Further, more research is needed to achieve robust robotic control policies based on the extracted features, as the policies trained in this work suffer from the distribution-shift problem typical of behavioural cloning.
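The sequential focusing of attention described in the abstract can be illustrated with a short sketch: rather than processing the full volume at once, the system repeatedly crops a small sub-volume around an attention point, summarizes it, and picks the next point to look at. The following NumPy toy is a minimal sketch of that loop only; the function names, the centroid-based "feature extractor", and the trivial attention "policy" are illustrative placeholders, not the recurrent attention model actually used in the thesis.

```python
# Toy sketch of sequential attention over a large point cloud.
# All names and the trivial policy are illustrative placeholders.
import numpy as np

def crop_subvolume(points, center, half_size):
    """Return the points inside an axis-aligned box around `center`."""
    mask = np.all(np.abs(points - center) <= half_size, axis=1)
    return points[mask]

def attend_sequentially(points, n_glimpses=5, half_size=0.1):
    """Glimpse loop: crop -> summarize -> choose next attention point."""
    center = points.mean(axis=0)        # start at the cloud centroid
    features = []
    for _ in range(n_glimpses):
        glimpse = crop_subvolume(points, center, half_size)
        if len(glimpse) == 0:
            break
        summary = glimpse.mean(axis=0)  # stand-in for a learned feature extractor
        features.append(summary)
        center = summary                # stand-in for a learned attention policy
    return np.stack(features) if features else np.empty((0, 3))

cloud = np.random.default_rng(0).random((100_000, 3))  # synthetic point cloud
print(attend_sequentially(cloud).shape)                # e.g. (5, 3)
```

The point of the pattern is that each glimpse touches only a small sub-volume, so the per-step network cost is bounded regardless of how large the full volume is; in a real system a spatial index would also make the crop itself cheap, which is what makes high-rate processing of arbitrarily large volumes plausible.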
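Behavioural cloning, used to train the visual servoing policies mentioned above, amounts to plain supervised regression on (observation, expert action) pairs. The sketch below shows that framing and the distribution-shift caveat in the simplest possible form; the linear policy and the synthetic data are assumptions for illustration, not the thesis's actual setup.

```python
# Minimal behavioural-cloning sketch: fit a policy to expert demonstrations.
import numpy as np

rng = np.random.default_rng(0)
obs = rng.normal(size=(1000, 16))                # stand-in for extracted visual features
expert_actions = obs @ rng.normal(size=(16, 3))  # stand-in for demonstrated robot actions

W = np.zeros((16, 3))                            # weights of a linear policy
lr = 0.01
for _ in range(500):                             # gradient descent on the MSE imitation loss
    grad = obs.T @ (obs @ W - expert_actions) / len(obs)
    W -= lr * grad

# Distribution shift: the cloned policy is trained only on states the expert
# visited, so small errors compound once the robot drifts off-distribution.
print(np.mean((obs @ W - expert_actions) ** 2))  # low error, but only on expert states
```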
dc.language.iso: eng
dc.publisher: NTNU
dc.relation.ispartofseries: Doctoral theses at NTNU;2023:266
dc.relation.haspart: Paper A: Dyrstad, Jonatan Sjølund; Mathiassen, John Reidar. Grasping Virtual Fish: A Step Towards Robotic Deep Learning from Demonstration in Virtual Reality. 2017 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 1181–1187. https://doi.org/10.1109/ROBIO.2017.8324578
dc.relation.haspart: Paper B: Dyrstad, Jonatan Sjølund; Bakken, Marianne; Grøtli, Esten Ingar; Schulerud, Helene; Mathiassen, John Reidar. Bin Picking of Reflective Steel Parts using a Dual-Resolution Convolutional Neural Network Trained in a Simulated Environment. 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 530–537. https://doi.org/10.1109/ROBIO.2018.8664766
dc.relation.haspart: Paper C: Dyrstad, Jonatan Sjølund; Øye, Elling Ruud; Stahl, Annette; Mathiassen, John Reidar. Teaching a Robot to Grasp Real Fish by Imitation Learning from a Human Supervisor in Virtual Reality. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 7185–7192. https://doi.org/10.1109/IROS.2018.8593954
dc.relation.haspart: Paper D: Dyrstad, Jonatan Sjølund; Øye, Elling Ruud; Stahl, Annette; Mathiassen, John Reidar. Robotic grasping in arbitrarily sized volumes with recurrent attention models.
dc.title: Robot learning with visual processing in arbitrarily sized, high resolution volumes
dc.type: Doctoral thesis
dc.subject.nsi: VDP::Technology: 500::Information and communication technology: 550::Technical cybernetics: 553


Files in this item


This item appears in the following Collection(s)
