Show simple item record

dc.contributor.advisor: Natvig, Lasse [nb_NO]
dc.contributor.author: Lillesand, Trond Inge [nb_NO]
dc.date.accessioned: 2014-12-19T13:40:17Z
dc.date.available: 2014-12-19T13:40:17Z
dc.date.created: 2013-10-12 [nb_NO]
dc.date.issued: 2013 [nb_NO]
dc.identifier: 655637 [nb_NO]
dc.identifier: ntnudaim:9105 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/253392
dc.description.abstract: In this thesis, the application kernels 2D-Convolution and Merge Sort are implemented in OmpSs, NEON and OpenCL on an Arndale development board containing an Exynos 5 SoC. The SoC contains a dual-core ARM Cortex-A15 processor and a Mali-T604 GPU. A scheme for measuring whole-board energy consumption is then created, and performance and energy efficiency metrics are used to evaluate the various implementations. The frequency is also scaled for the different CPU implementations to see how different frequencies affect these metrics. NEON vectorization is exploited by using vector extractions on the 2D-Convolution kernel to improve locality. For Merge Sort, NEON is exploited by performing in-register sorting with a bitonic sorting network. With OpenCL, bitonic and odd-even sorting networks are used. Different scheduling policies are tried in the OmpSs implementations to find the best-performing policy. Vectorization with NEON gives the highest performance on both applications, and the highest energy efficiency for Merge Sort. Vectorization with NEON results in high performance at the expense of high power consumption. The OpenCL implementation of 2D-Convolution gave high performance and low power consumption, and achieved the highest energy efficiency. For the OmpSs implementations, the choice of scheduling policy proved to affect performance. Scaling the frequency on the applications shows that there is a balance point between frequency and energy efficiency: too high a frequency on two cores results in a larger increase in power than in performance, and too low a frequency results in a larger decrease in performance than in power. The results indicate that this difference increases with the number of cores. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.title: Acceleration with OmpSs and Neon/OpenCL on ARM Processor [nb_NO]
dc.title.alternative: Acceleration with OmpSs and Neon/OpenCL on ARM Processor [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 160 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]

