Energy Efficiency Studies of Mont Blanc Applications

Holden, Mads
Master thesis
URI
http://hdl.handle.net/11250/253268
Date
2013
Collections
  • Institutt for datateknologi og informatikk [3776]
Abstract
In this thesis, the performance and energy efficiency of four implementations of matrix multiplication, written in OmpSs and OpenCL, are tested and evaluated. The benchmarking is done on an Intel Ivy Bridge Core i7 3770K. The results are evaluated and discussed with regard to different optimization configurations, such as vectorization and multi-threading. Energy measurements are taken using PAPI, which in turn uses the Running Average Power Limit (RAPL) interface in the Intel processor to take energy readings. Performance is presented in MFLOPS, while energy efficiency is compared using MFLOPS/W, watts used, the energy-delay product, and the energy-delay-squared product. The OpenCL versions are compared with and without vectorization. One of the OmpSs applications is also measured with regard to vectorization and number of threads. The last OmpSs version uses the BLAS implementation ATLAS, which is already vectorized, so it is compared only by number of threads. SSE and AVX vectorization is shown to significantly improve performance while using little to no extra energy per second for all implementations. Multi-threading also gives higher performance, but it consumes more energy. With ATLAS, running with eight threads was shown to spend more energy while performing worse. The OmpSs version using ATLAS was both the fastest and the most energy efficient, peaking at 125 GFLOPS and 2.7 GFLOPS/W while running with four threads and using AVX.
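The metrics named in the abstract can be derived from three measured quantities per run: problem size, run time, and energy. The sketch below is an illustration only, not the thesis's code. It assumes the conventional 2n³ operation count for a dense n×n matrix multiplication, and the names `Run`, `matmul_flops`, and `metrics` as well as the sample numbers are hypothetical and not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One benchmark run (hypothetical record layout)."""
    n: int          # matrix dimension, for an n x n by n x n multiplication
    seconds: float  # measured wall-clock time
    joules: float   # measured energy, e.g. from a RAPL-based reading


def matmul_flops(n: int) -> float:
    # Assumption: conventional count of n^3 multiplies plus n^3 additions.
    return 2.0 * n ** 3


def metrics(run: Run) -> dict:
    """Derive performance and energy-efficiency metrics for one run."""
    mflops = matmul_flops(run.n) / run.seconds / 1e6
    watts = run.joules / run.seconds           # average power
    return {
        "mflops": mflops,
        "watts": watts,
        "mflops_per_watt": mflops / watts,
        "edp": run.joules * run.seconds,        # energy-delay product (J*s)
        "ed2p": run.joules * run.seconds ** 2,  # energy-delay-squared (J*s^2)
    }


if __name__ == "__main__":
    # Made-up sample values, chosen only to make the arithmetic easy to follow.
    sample = Run(n=1000, seconds=0.5, joules=20.0)
    for name, value in metrics(sample).items():
        print(f"{name}: {value:g}")
```

Lower is better for the energy-delay product and the energy-delay-squared product, while higher is better for MFLOPS and MFLOPS/W. The squared variant weights run time more heavily, so it favours faster configurations even when they draw more power.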
Publisher
Institutt for datateknikk og informasjonsvitenskap

Privacy policy
DSpace software copyright © 2002-2019  DuraSpace

Service from  Unit
 

 
