Maritime Object Detection in LWIR-images using Deep Learning methods with Data Augmentation

Kjønås, Ingunn

dc.contributor.advisor	Brekke, Edmund Førland
dc.contributor.advisor	Mester, Rudolf
dc.contributor.advisor	Eide, Egil
dc.contributor.author	Kjønås, Ingunn
dc.date.accessioned	2021-10-12T17:22:45Z
dc.date.available	2021-10-12T17:22:45Z
dc.date.issued	2021
dc.identifier	no.ntnu:inspera:77039769:51805054
dc.identifier.uri	https://hdl.handle.net/11250/2789448
dc.description.abstract	The context of this thesis is the use of multisensor multitarget tracking for an autonomous surface vessel in a harbour area. The purpose of the research is to locate and track an autonomous ferry together with other targets and obstacles in order to avoid collisions during autonomous navigation. The final goal is to improve the robustness and reliability of the tracking system by means of sensor fusion. Infrared cameras can improve night vision and resolution, in addition to providing more information about the target's characteristics. This thesis therefore focuses on object detection in infrared images. To address this, relevant literature is reviewed covering approaches to object detection in maritime infrared images, with particular focus on neural networks and data augmentation techniques. Few annotated long-wave infrared images of boats are available, so more data has been collected and annotated with the aim of training and testing neural networks on these images. The neural network models YOLOv3 and EfficientDet-D0 are trained and tested on the available and collected data, and their performance is compared. Data augmentation techniques are often used in the general computer vision domain to increase the variation in the training data, but no studies have so far investigated the effect on maritime long-wave infrared images. Because of this, and in combination with a dataset of limited size, the potential improvement from data augmentation during training of the neural networks is tested in this thesis. The results show that both models perform well, with a detection probability of 100% for two moving boats when the pixel area of the boats is above 1800. For smaller objects the detection results are considerably worse, which shows that there is a range limit for detecting the targets in the infrared images. 
A comparison of the two models shows that YOLOv3 performs slightly better at detecting small objects, although the effect is too small to conclude that one model is better than the other. Combining the data augmentation techniques flip, scale and mosaic significantly improves the results for both models, with mosaic giving the largest improvement. When the detections are to be used for collision avoidance, it can be useful to extract information about the type of boat, which can be used, among other things, to estimate the speed and heading of the target. To investigate the possibility of distinguishing motorboats from sailboats, the neural network models are tested with detection and classification combined. This results in promising performance, although misclassifications are common and it leads to more false positive predictions compared with training on a single boat class.
dc.description.abstract	The context of this project is the use of multisensor multitarget tracking for an Autonomous Surface Vehicle (ASV) in a harbour environment. The purpose of the research is to locate and track an ASV in combination with other targets for collision avoidance in autonomous navigation. The final goal is to improve the robustness and reliability of the tracking system by means of sensor fusion. Infrared cameras can improve night vision, improve resolution and provide more feature information. Therefore, this thesis focuses on detection performance in infrared images. To address this, a literature review is conducted covering approaches to object detection in maritime infrared images, with a focus on neural networks and data augmentation techniques. Few annotated Long Wave Infrared (LWIR) images of boats are available, so more images are collected and annotated for the purpose of training and testing neural networks on the data. The neural network models YOLOv3 and EfficientDet-D0 are trained and tested on the available and collected data, and their performance is compared. Data augmentation is frequently used in the general computer vision community to increase the variation in the training data, but no studies have previously examined its effect on maritime LWIR images. Because of this, and motivated by the limited size of the available dataset, the effect of data augmentation during training of the neural networks is examined in this thesis. The results show that both models perform well, with a probability of detection of 100% for two moving target boats when the pixel area is above a threshold of 1800. For smaller objects, the detection performance is significantly reduced, showing that infrared camera object detection has a limited range. The comparison of the models shows that YOLOv3 performs slightly better for smaller targets, although the effect is too small to conclude that one model is superior to the other. 
The combined data augmentation techniques flip, scale and mosaic give a significant increase in performance for both models, with mosaic providing the greatest improvement. Finally, for the application of collision avoidance it can be useful to extract information related to the type of boat, which can be used, for instance, to estimate velocity and heading. To test the possibility of separating motorboats from sailboats, the neural networks are tested with detection and classification combined, resulting in promising performance, although misclassifications are common and more false positive predictions are introduced than when training on a single boat class.
dc.language	eng
dc.publisher	NTNU
dc.title	Maritime Object Detection in LWIR-images using Deep Learning methods with Data Augmentation
dc.type	Master thesis

Associated file(s)

Filename: no.ntnu:inspera:77039769:51805 ...
Size: 22.06 MB
Format: PDF

This item appears in the following collection(s)

Institutt for elektroniske systemer [2289]
