Vis enkel innførsel

dc.contributor.authorAl Hasib, Abdullah
dc.contributor.authorCebrian, Juan Manuel
dc.contributor.authorNatvig, Lasse
dc.date.accessioned2019-04-12T08:26:22Z
dc.date.available2019-04-12T08:26:22Z
dc.date.created2018-10-11T10:56:51Z
dc.date.issued2018
dc.identifier.citationJournal of Supercomputing. 2018, 74 (6), 2705-2728.nb_NO
dc.identifier.issn0920-8542
dc.identifier.urihttp://hdl.handle.net/11250/2594414
dc.description.abstractClustering algorithms (i.e., Gaussian mixture models, k-means) tackle the problem of grouping a set of elements in such a way that elements from the same group (or cluster) have more similar properties to each other than to those elements in other clusters. This simple concept turns out to be the basis in complex algorithms from many application areas, including sequence analysis and genotyping in bioinformatics, medical imaging, antimicrobial activity, market research, social networking, etc. However, as the data volume continues to increase, the performance of clustering algorithms is heavily influenced by the memory subsystem. In this paper, we propose a novel and efficient implementation of Lloyd’s k-means clustering algorithm to substantially reduce data movement along the memory hierarchy. Our contributions are based on the fact that the vast majority of processors are equipped with powerful Single Instruction Multiple Data (SIMD) instructions that are, in most cases, underused. SIMD improves the CPU computational power and, if used wisely, can be seen as an opportunity to improve on the application data transfers by compressing/decompressing the data, specially for memory-bound applications. Our contributions include a SIMD-friendly data layout organization, in-register implementation of key functions and SIMD-based compression. We demonstrate that using our optimized SIMD-based compression method, it is possible to improve the performance and energy of k-means by a factor of 4.5x and 8.7x, respectively, for a i7 Haswell machine, and 22x and 22.2x for Xeon Phi: KNL, running a single thread.nb_NO
dc.language.isoengnb_NO
dc.publisherSpringer Verlagnb_NO
dc.titleA vectorized k-means algorithm for compressed datasets: design and experimental analysisnb_NO
dc.typeJournal articlenb_NO
dc.typePeer reviewednb_NO
dc.description.versionacceptedVersionnb_NO
dc.source.pagenumber2705-2728nb_NO
dc.source.volume74nb_NO
dc.source.journalJournal of Supercomputingnb_NO
dc.source.issue6nb_NO
dc.identifier.doi10.1007/s11227-018-2310-0
dc.identifier.cristin1619591
dc.description.localcodeThis is a post-peer-review, pre-copyedit version of an article published in [Journal of Supercomputing]. The final authenticated version is available online at: https://doi.org/10.1007/s11227-018-2310-0nb_NO
cristin.unitcode194,63,10,0
cristin.unitnameInstitutt for datateknologi og informatikk
cristin.ispublishedtrue
cristin.fulltextpostprint
cristin.qualitycode1


Tilhørende fil(er)

Thumbnail

Denne innførselen finnes i følgende samling(er)

Vis enkel innførsel