A vectorized k-means algorithm for compressed datasets: design and experimental analysis

Al Hasib, Abdullah; Cebrian, Juan Manuel; Natvig, Lasse

dc.contributor.author	Al Hasib, Abdullah
dc.contributor.author	Cebrian, Juan Manuel
dc.contributor.author	Natvig, Lasse
dc.date.accessioned	2019-04-12T08:26:22Z
dc.date.available	2019-04-12T08:26:22Z
dc.date.created	2018-10-11T10:56:51Z
dc.date.issued	2018
dc.identifier.citation	Journal of Supercomputing. 2018, 74 (6), 2705-2728.	nb_NO
dc.identifier.issn	0920-8542
dc.identifier.uri	http://hdl.handle.net/11250/2594414
dc.description.abstract	Clustering algorithms (e.g., Gaussian mixture models, k-means) tackle the problem of grouping a set of elements in such a way that elements from the same group (or cluster) are more similar to each other than to elements in other clusters. This simple concept turns out to be the basis of complex algorithms in many application areas, including sequence analysis and genotyping in bioinformatics, medical imaging, antimicrobial activity, market research, social networking, etc. However, as the data volume continues to increase, the performance of clustering algorithms is heavily influenced by the memory subsystem. In this paper, we propose a novel and efficient implementation of Lloyd’s k-means clustering algorithm to substantially reduce data movement along the memory hierarchy. Our contributions are based on the fact that the vast majority of processors are equipped with powerful Single Instruction Multiple Data (SIMD) instructions that are, in most cases, underused. SIMD improves the CPU computational power and, if used wisely, can be seen as an opportunity to improve the application data transfers by compressing/decompressing the data, especially for memory-bound applications. Our contributions include a SIMD-friendly data layout organization, in-register implementation of key functions and SIMD-based compression. We demonstrate that using our optimized SIMD-based compression method, it is possible to improve the performance and energy of k-means by factors of 4.5x and 8.7x, respectively, for an i7 Haswell machine, and 22x and 22.2x for Xeon Phi: KNL, running a single thread.	nb_NO
dc.language.iso	eng	nb_NO
dc.publisher	Springer Verlag	nb_NO
dc.title	A vectorized k-means algorithm for compressed datasets: design and experimental analysis	nb_NO
dc.type	Journal article	nb_NO
dc.type	Peer reviewed	nb_NO
dc.description.version	acceptedVersion	nb_NO
dc.source.pagenumber	2705-2728	nb_NO
dc.source.volume	74	nb_NO
dc.source.journal	Journal of Supercomputing	nb_NO
dc.source.issue	6	nb_NO
dc.identifier.doi	10.1007/s11227-018-2310-0
dc.identifier.cristin	1619591
dc.description.localcode	This is a post-peer-review, pre-copyedit version of an article published in [Journal of Supercomputing]. The final authenticated version is available online at: https://doi.org/10.1007/s11227-018-2310-0	nb_NO
cristin.unitcode	194,63,10,0
cristin.unitname	Institutt for datateknologi og informatikk
cristin.ispublished	true
cristin.fulltext	postprint
cristin.qualitycode	1

Associated file(s)

Filename: k-means-super.pdf
Size: 752.7Kb
Format: PDF
Description: Al Hasib

This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6620]
Publikasjoner fra CRIStin - NTNU [37703]