Acceleration with OmpSs and Neon/OpenCL on ARM Processor

Lillesand, Trond Inge
Master thesis
URI
http://hdl.handle.net/11250/253392
Date
2013
Collections
  • Institutt for datateknologi og informatikk [3771]
Abstract
In this thesis, the application kernels 2D-Convolution and Merge Sort are implemented in OmpSs, NEON and OpenCL on an Arndale development board containing an Exynos 5 SoC. The SoC contains a dual-core ARM Cortex-A15 processor and a Mali-T604 GPU. A scheme for measuring whole-board energy consumption is then created, and performance and energy efficiency metrics are used to evaluate the various implementations. The frequency is also scaled for the different CPU implementations to see how different frequencies affect these metrics. For the 2D-Convolution kernel, NEON vectorization is exploited by using vector extractions to improve locality. For Merge Sort, NEON is exploited by performing in-register sorting with a bitonic sorting network. With OpenCL, bitonic and odd-even sorting networks are used. Different scheduling policies are tried in the OmpSs implementations to find the best-performing policy. Vectorization with NEON gives the highest performance on both applications and the highest energy efficiency for Merge Sort, but this high performance comes at the expense of high power consumption. The OpenCL implementation of 2D-Convolution gives high performance at low power consumption and achieves the highest energy efficiency for that kernel. For the OmpSs implementations, the choice of scheduling policy proves to affect performance. Scaling the frequency shows that there is a balance point between frequency and energy efficiency: on two cores, too high a frequency results in a larger increase in power than in performance, and too low a frequency results in a larger decrease in performance than in power. The results indicate that this difference increases with the number of cores.
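The bitonic sorting network mentioned in the abstract can be illustrated with a minimal scalar sketch. This is not the thesis code: the function name `bitonic_sort` is hypothetical, the input length is assumed to be a power of two, and plain Python stands in for the NEON and OpenCL implementations. What it shows is the fixed, data-independent pattern of compare-exchange steps, which is the property that lets such a network run on vector registers or GPU work-items.

```python
def bitonic_sort(values):
    """Return a sorted copy of `values` using a bitonic sorting network.

    Illustrative sketch only; assumes len(values) is a power of two.
    """
    data = list(values)
    n = len(data)
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two")

    k = 2
    while k <= n:              # k: size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # j: distance between the two compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:                 # visit each pair once
                    ascending = (i & k) == 0    # sort direction of this block
                    # Compare-exchange: swap if the pair is in the wrong order.
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data
```

The sequence of index pairs depends only on the input length, never on the values, so every comparison within one inner pass is independent of the others and could in principle be performed in parallel.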
Publisher
Institutt for datateknikk og informasjonsvitenskap
