Throughput Computing on Future GPUs

Hovland, Rune Johan

dc.contributor.advisor	Elster, Anne Cathrine	nb_NO
dc.contributor.advisor	Hetland, Magnus Lie	nb_NO
dc.contributor.author	Hovland, Rune Johan	nb_NO
dc.date.accessioned	2014-12-19T13:34:01Z
dc.date.available	2014-12-19T13:34:01Z
dc.date.created	2010-09-04	nb_NO
dc.date.issued	2009	nb_NO
dc.identifier	348820	nb_NO
dc.identifier	ntnudaim:4669	nb_NO
dc.identifier.uri	http://hdl.handle.net/11250/251376
dc.description.abstract	The general-purpose computing capabilities of the Graphics Processing Unit (GPU) have recently received a great deal of attention from the High-Performance Computing (HPC) community. Because massively parallel applications can run efficiently on commodity graphics cards, personal supercomputers are now available as desktop machines at a low price. For some applications, speedups of 70 times over a single-CPU implementation have been achieved. Among the most popular GPUs are those based on the NVIDIA Tesla architecture, which allows relatively easy development of GPU applications using the NVIDIA CUDA programming environment. While the GPU is gaining interest in the HPC community, others are more reluctant to embrace it as a computational device. The focus on throughput and large data volumes separates Information Retrieval (IR) from HPC: for IR it is critical to process large amounts of data efficiently, a task at which the GPU currently does not excel. The IR community has only recently begun to explore the possibilities, and an implementation of a search engine for the GPU was published in April 2009. This thesis analyzes how GPUs can be improved to better suit large-data-volume applications. Current graphics cards have a bottleneck in the transfer of data between the host and the GPU. One approach to resolving this bottleneck is to include the host memory as part of the GPU's memory hierarchy. We develop a theoretical model and, based on this model, show the expected performance improvement for high-data-volume applications, both computationally bound and data-transfer bound. The performance improvement for an existing search engine is also given based on the theoretical model. For this case, the improvements would result in a speedup of between 1.389 and 1.874 for the various query types supported by the search engine.	nb_NO
dc.language	eng	nb_NO
dc.publisher	Institutt for datateknikk og informasjonsvitenskap	nb_NO
dc.subject	ntnudaim	no_NO
dc.subject	SIF2 datateknikk	no_NO
dc.subject	Komplekse datasystemer	no_NO
dc.title	Throughput Computing on Future GPUs	nb_NO
dc.type	Master thesis	nb_NO
dc.source.pagenumber	92	nb_NO
dc.contributor.department	Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap	nb_NO

Associated file(s)

Filename: 348820_COVER01.pdf
Size: 46.41 KB
Format: PDF


Filename: 348820_FULLTEXT01.pdf
Size: 1.807 MB
Format: PDF


This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6552]
