Improved Sheep Detection - Modifying YOLOv5 to accurately detect grazing sheep in UAV imagery

Nygård, Ingebrigt; Vittersø, Sebastian

dc.contributor.advisor	Hvasshovd, Svein-Olaf
dc.contributor.author	Nygård, Ingebrigt
dc.contributor.author	Vittersø, Sebastian
dc.date.accessioned	2022-12-22T18:19:32Z
dc.date.available	2022-12-22T18:19:32Z
dc.date.issued	2022
dc.identifier	no.ntnu:inspera:112046434:26444326
dc.identifier.uri	https://hdl.handle.net/11250/3039298
dc.description.abstract	Sauebønder i Norge har et sårt behov for modernisering. Bøndene bruker ofte utmark som beiteområder for sauene sine, og i beitesesongen er bonden lovpålagt å gjennomføre ukentlige inspeksjoner av besetningen, noe som i dag er en manuell oppgave. Ved sesongslutt skal sauene lokaliseres og samles, og bonden trenger ofte bistand fra familie, venner og naboer til dette. De ukentlige inspeksjonene og lokaliseringen av sau er tidkrevende, men en hjelp for bonden kan være å ta i bruk autonome droner, kombinert med de automatiske bildebehandlingsmulighetene i moderne maskinlæring. Denne masteroppgaven fokuserer på maskinlæringsaspektene ved et system som løser denne utfordringen. Den vil vurdere hvilke endringer som kan gjøres i objektdeteksjonsnettverkene for å forbedre nettverkets resultater ytterligere. All testing gjøres med variasjoner av nettverksarkitekturen YOLOv5, brukt på et datasett med relevante RGB- og IR-bilder av sauer, tatt av en fjernstyrt drone. Gjennom grundig undersøkelse tester oppgaven effekten tre forskjellige variasjoner har på resultatene. For det første har modellstørrelsen, en redigerbar parameter i YOLOv5-rammeverket, blitt variert for å se i hvilken grad de reduserte datakravene til en mindre modell vil forringe nøyaktigheten til prediksjonene som er gjort. For det andre gjøres det endringer i modellarkitekturen for å kunne inkludere IR data. To nye varianter foreslås: En bruker en fusjonsbasert arkitektur som behandler RGB- og IR-bildet i separate pipelines før den fusjonerer dataene etter backbonen. En annen varierer bare litt fra den vanlige RGB-arkitekturen ved å akseptere 4-kanals input på et RGBI-format, og deretter behandle det akkurat som før. For det tredje er den siste variasjonen knyttet til preprosessering av bildedataene og postprosessering av resultatene produsert av nettverket. Det undersøkes om en modell som aksepterer flere mindre, flislagte bilder yter bedre enn en modell uten denne preprosesseringen. 
En algoritme er utviklet for å sammenligne resultatene ved å kombinere de mindre prediksjonene til fullbildeprediksjoner. Alle de mulige kombinasjonene av disse tre variantene er trent og testet på to separate datasett. Det er klare trender mot forbedret nøyaktighet ved bruk av IR-bilder, også noe forbedring ved bruk av flislagte bilder, selv om sistnevnte også drastisk øker prosesseringstiden. De mindre modellene presterte nesten like bra som de store, men det vises en liten økning i nøyaktigheten når de større modellene brukes. Ettersom alle modellene er i stand til å utføre nødvendig prosessering innenfor gitte tidsrammer, gitt riktig maskinvare, er alle de testede modellene brukbare i praktiske applikasjoner. Implementasjonen av et system som bruker disse prinsippene er ikke bare realistiske, men kan være en nødvendig vei mot et mer bærekraftig husdyrhold og landbrukspraksis i fremtiden
dc.description.abstract	Sheep farmers in Norway are in sore need of modernization. The farmers often use rangelands as grazing areas for their sheep, and during the grazing season the farmer is legally required to conduct weekly inspections of the herd, which is currently a manual task. At season’s end, the sheep must be located and collected, and the farmer often needs assistance from family, friends and neighbors for this. The weekly inspections and the locating of sheep are time-consuming tasks, but the farmer's burden might be eased by autonomous drones combined with the automatic image processing capabilities of modern machine learning. This master's thesis focuses on the machine learning aspects of a system solving this challenge. It considers what changes can be made to state-of-the-art object detection networks to further improve their results. All testing is done using variations of the network architecture YOLOv5, applied to a dataset of relevant RGB and IR imagery of sheep, captured by a remote-controlled UAV. Through rigorous examination, the thesis tests what effects three types of model variations have on the results. Firstly, the model size, an editable parameter in the YOLOv5 framework, is varied to see to what degree the lowered computing demands of a smaller model degrade the accuracy of the predictions made. Secondly, changes are made to the model architecture to include IR data. Two new variants are proposed: One variant applies a fusion-based architecture which processes the RGB and the IR image in separate pipelines before fusing the intermediate data after the backbone. The other varies only slightly from the default RGB-only architecture, accepting a 4-channel input in RGBI format and otherwise processing it just as before. Thirdly, the final variation relates to the preprocessing of the image data and the postprocessing of the results produced by the network. 
It is examined whether a model accepting multiple smaller image tiles performs better than one without this tiling preprocessing, and a system is designed to combine the tiled predictions into full-image predictions so that the results can be compared. All possible combinations of these three variations are trained and tested on two separate datasets. There are clear trends towards improved accuracy when using IR imagery, and similar, though less clear, trends can be seen when using tiled images. It’s worth noting that the latter also drastically increases processing times. The smaller models performed almost as well as the large ones, but there is a slight increase in accuracy when the larger models are used. As all models are able to perform the necessary processing within the given time frames, given proper hardware, these models are all usable in practical applications. The implementation of a system using these principles is not only viable, but might be a necessary path towards more sustainable husbandry and agricultural practices in the future.
dc.language	eng
dc.publisher	NTNU
dc.title	Improved Sheep Detection - Modifying YOLOv5 to accurately detect grazing sheep in UAV imagery
dc.type	Master thesis

Associated file(s)

Filename: no.ntnu:inspera:112046434:2644 ...
Size: 18.65 MB
Format: PDF

This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6620]