
dc.contributor.author: Wang, Xinliang
dc.contributor.author: Liu, Weifeng
dc.contributor.author: Xue, Wei
dc.contributor.author: Wu, Li
dc.date.accessioned: 2018-04-09T08:14:57Z
dc.date.available: 2018-04-09T08:14:57Z
dc.date.created: 2018-04-06T09:21:44Z
dc.date.issued: 2018
dc.identifier.citation: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 2018. [nb_NO]
dc.identifier.issn: 1542-0205
dc.identifier.uri: http://hdl.handle.net/11250/2493147
dc.description.abstract: Sparse triangular solve (SpTRSV) is one of the most important kernels in many real-world applications. Currently, much research on parallel SpTRSV focuses on level-set construction for reducing the number of inter-level synchronizations. However, uncontrolled data reuse and the high cost of global memory or shared cache access in inter-level synchronization have been largely neglected in existing work. In this paper, we propose a novel data layout called Sparse Level Tile to bring all data reuse under control, and design a Producer-Consumer pairing method so that inter-level synchronization happens only through very fast register communication. We implement our data layout and algorithms on an SW26010 many-core processor, which is the main building block of Sunway TaihuLight, currently the world's fastest supercomputer. Experimental results on all 2057 square matrices from the Florida Matrix Collection show that our method achieves an average speedup of 6.9 and a best speedup of 38.5 over the parallel level-set method. Our method also outperforms the latest methods on a KNC many-core processor on 1856 matrices and the latest methods on a K80 GPU on 1672 matrices. [nb_NO]
dc.language.iso: eng [nb_NO]
dc.publisher: Association for Computing Machinery (ACM) [nb_NO]
dc.title: swSpTRSV: A Fast Sparse Triangular Solve with Sparse Level Tile Layout on Sunway Architectures [nb_NO]
dc.type: Journal article [nb_NO]
dc.type: Peer reviewed [nb_NO]
dc.description.version: publishedVersion [nb_NO]
dc.source.pagenumber: 16 [nb_NO]
dc.source.journal: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming [nb_NO]
dc.identifier.doi: https://doi.org/10.1145/3178487.3178513
dc.identifier.cristin: 1577866
dc.description.localcode: © 2018 Copyright held by the owner/author(s). Publication rights licensed to the Association for Computing Machinery. [nb_NO]
cristin.unitcode: 194,63,10,0
cristin.unitname: Institutt for datateknologi og informatikk
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 1

