dc.contributor.author | Koraei, Mostafa | |
dc.contributor.author | Fatemi, Omid | |
dc.contributor.author | Jahre, Magnus | |
dc.date.accessioned | 2019-11-06T07:38:33Z | |
dc.date.available | 2019-11-06T07:38:33Z | |
dc.date.created | 2019-10-14T15:11:17Z | |
dc.date.issued | 2019 | |
dc.identifier.issn | 1544-3566 | |
dc.identifier.uri | http://hdl.handle.net/11250/2626772 | |
dc.description.abstract | Iterative Stencil Loops (ISLs) are the key kernel within a range of compute-intensive applications. To accelerate ISLs with Field Programmable Gate Arrays, it is critical to exploit parallelism (1) among elements within the same iteration and (2) across loop iterations. We propose a novel ISL acceleration scheme called Direct Computation of Multiple Iterations (DCMI) that improves upon prior work by pre-computing the effective stencil coefficients after a number of iterations at design time—resulting in accelerators that use minimal on-chip memory and avoid redundant computation. This enables DCMI to improve throughput by up to 7.7× compared to the state-of-the-art cone-based architecture. | nb_NO |
dc.language.iso | eng | nb_NO |
dc.publisher | Association for Computing Machinery (ACM) | nb_NO |
dc.title | DCMI: A Scalable Strategy for Accelerating Iterative Stencil Loops on FPGAs | nb_NO |
dc.type | Journal article | nb_NO |
dc.type | Peer reviewed | nb_NO |
dc.description.version | publishedVersion | nb_NO |
dc.source.volume | 16 | nb_NO |
dc.source.journal | ACM Transactions on Architecture and Code Optimization (TACO) | nb_NO |
dc.source.issue | 4 | nb_NO |
dc.identifier.doi | 10.1145/3352813 | |
dc.identifier.cristin | 1736938 | |
dc.relation.project | EC/H2020/688403 | nb_NO |
dc.description.localcode | This article will not be available due to copyright restrictions. (c) 2019 by ACM | nb_NO |
cristin.unitcode | 194,63,10,0 | |
cristin.unitname | Institutt for datateknologi og informatikk | |
cristin.ispublished | true | |
cristin.fulltext | postprint | |
cristin.qualitycode | 2 | |