Comparative performance modeling of a WCSPH proxy application on GPU and CPU architectures

Kaldager, Andreas Kolstø

dc.contributor.advisor	Meyer, Jan Christian
dc.contributor.author	Kaldager, Andreas Kolstø
dc.date.accessioned	2019-10-31T15:17:25Z
dc.date.available	2019-10-31T15:17:25Z
dc.date.issued	2019
dc.identifier.uri	http://hdl.handle.net/11250/2625845
dc.description.abstract	In this thesis, we examine the performance of a WCSPH proxy application applied to the dam-break problem when the computational part is offloaded to the GPU. We implement a naive and a sophisticated approach to the problem's bottleneck, which is finding neighboring particles. We build performance models for communication, computation, and their underlying parts. By applying these models to empirical data, we examine the scalability characteristics of the application, the Speedup and Efficiency of offloading, and the impact of the bottleneck on the application; we also develop a utility range for offloading, analyse the error of the estimates used, and develop a formula for the largest problem size that fits in a given GPU's memory. The models are experimentally validated with empirical data. We conclude that offloading reduces the influence of the bottleneck and achieves a Speedup of 10, but lowers Efficiency. We also conclude that the communication amount will outgrow the computation amount given a sufficiently large problem size, that the estimates used have an error margin below 20%, and that the application scales well, with no performance drawbacks from horizontal scaling.
dc.description.abstract	In this thesis, we investigate the performance of a WCSPH proxy application applied to the dam-break problem when the computational part is offloaded to the GPU. We implement one naive and one sophisticated approach to the bottleneck of the problem, which is finding neighboring particles. We create performance models for communication, for computation, and for their constituent parts. Using these models and empirical data, we investigate the scalability characteristics of the application, the Speedup and Efficiency of the offloading, and the impact of the bottleneck on the application. We also devise a utility range for offloading the problem, analyse the errors of the estimates used, and derive a formula for the maximum problem size that fits in a given GPU's memory. The models are experimentally validated with the empirical data. We conclude that offloading lowers the impact of the bottleneck and achieves a Speedup of 10, but lowers Efficiency. We also conclude that the communication amount will outgrow the computation amount given a sufficiently large problem size, that the estimates used have errors below 20%, and that the application scales well, with no performance drawbacks from horizontal scaling.
dc.language	eng
dc.publisher	NTNU
dc.title	Comparative performance modeling of a WCSPH proxy application on GPU and CPU architectures
dc.type	Master thesis

Files in this item

Name: no.ntnu:inspera:2524681.pdf
Size: 6.922Mb
Format: PDF

Name: no.ntnu:inspera:2524681.zip
Size: 57.78Kb
Format: application/zip

This item appears in the following Collection(s)

Institutt for datateknologi og informatikk [6544]
