Auto-tunable GPU BLAS

Lien, Geir Josten

dc.contributor.advisor	Elster, Anne Cathrine	nb_NO
dc.contributor.author	Lien, Geir Josten	nb_NO
dc.date.accessioned	2014-12-19T13:38:40Z
dc.date.available	2014-12-19T13:38:40Z
dc.date.created	2012-11-08	nb_NO
dc.date.issued	2012	nb_NO
dc.identifier	565905	nb_NO
dc.identifier	ntnudaim:5976	nb_NO
dc.identifier.uri	http://hdl.handle.net/11250/252910
dc.description.abstract	In this paper, we present our implementation of an auto-tuning system, written in C++, which incorporates the use of OpenCL kernels. We deploy this approach on different GPU architectures and evaluate its performance. Our main focus is to easily generate tuned code, which would otherwise require a large amount of empirical testing, and then run it on any kind of device. This is achieved through the auto-tuning framework, which creates different kernels, compiles and runs them on the device, and outputs the best-performing kernel on the given platform. BLAS is widely used in performance-critical applications and is a good candidate for execution on GPUs because of the potential performance increase. Our implementation was benchmarked in various test environments with different GPUs, where we achieved results comparable to the ViennaCL library. We also tested against the native vendor-specific BLAS libraries from AMD and NVIDIA.	nb_NO
dc.language	eng	nb_NO
dc.publisher	Institutt for datateknikk og informasjonsvitenskap	nb_NO
dc.subject	ntnudaim:5976	no_NO
dc.subject	MIT informatikk	no_NO
dc.subject	Komplekse datasystemer	no_NO
dc.title	Auto-tunable GPU BLAS	nb_NO
dc.type	Master thesis	nb_NO
dc.source.pagenumber	63	nb_NO
dc.contributor.department	Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap	nb_NO

Associated file(s)

Filename: 565905_ATTACHMENT01.zip
Size: 12.24 MB
Format: Unknown

Filename: 565905_COVER01.pdf
Size: 184.2 KB
Format: PDF

Filename: 565905_FULLTEXT01.pdf
Size: 1.215 MB
Format: PDF

This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6778]
