DTP: Enabling Exhaustive Exploration of FPGA Temporal Partitions for Streaming HPC Applications
Chapter, Peer reviewed
Accepted version
Date
2017
Abstract
Reconfigurable computing systems show great promise for accelerating streaming HPC applications because of their low power consumption and high performance. However, mapping an HPC application to a reconfigurable system is a challenging task. The challenge is exacerbated by the need to temporally partition computational kernels when application requirements exceed resource availability. In this paper, we propose a novel design methodology that we call Dataflow Temporal Partitioning (DTP). The key insight in the design of DTP is that the application should be represented as a high-level data flow graph in which each node is a computational kernel and the edges represent inter-node data flow. DTP also supports parallel instantiation of kernels and multiple kernel implementations at different performance/area design points. In contrast to previous proposals, DTP is able to exhaustively explore the solution space for practical applications. Our evaluation of DTP shows that it is able to identify candidate implementations that outperform both previously proposed partitioning heuristics and a direct mapping to the synthesis tool. The temporal configuration selected by DTP can outperform the direct mapping by up to 3x.
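To make the idea of exhaustively exploring temporal partitions concrete, the following is a minimal illustrative sketch and not the DTP implementation from the paper. Everything in it is an assumption made for illustration: the kernel names, the area and time figures, the `explore` function, and the cost model (kernels in one partition stream concurrently, so a partition takes as long as its slowest kernel, and each reconfiguration adds a fixed cost). Parallel instantiation of kernels is omitted for brevity.

```python
from itertools import product

# Hypothetical kernels: name -> list of (area, time) implementation options.
KERNELS = {
    "A": [(40, 10), (70, 6)],
    "B": [(50, 8)],
    "C": [(30, 12), (60, 7)],
}
# Hypothetical data flow edges (producer, consumer): A -> B -> C.
EDGES = [("A", "B"), ("B", "C")]
AREA_BUDGET = 100    # assumed FPGA area available to one partition
RECONFIG_COST = 5    # assumed time penalty per reconfiguration


def explore(kernels, edges, area_budget, reconfig_cost):
    """Enumerate every partition/implementation choice and keep the fastest."""
    names = list(kernels)
    index = {name: i for i, name in enumerate(names)}
    best = None
    # stages[i] is the temporal partition that kernel i is placed in.
    for stages in product(range(len(names)), repeat=len(names)):
        # A producer may not be scheduled in a later partition than its consumer.
        if any(stages[index[u]] > stages[index[v]] for u, v in edges):
            continue
        # Skip assignments that leave a gap in the partition numbering.
        n_parts = max(stages) + 1
        if set(stages) != set(range(n_parts)):
            continue
        # Try every combination of implementation options.
        for impls in product(*(range(len(kernels[n])) for n in names)):
            area = [0] * n_parts
            time = [0] * n_parts
            for i, name in enumerate(names):
                a, t = kernels[name][impls[i]]
                area[stages[i]] += a
                # Assumed streaming model: the slowest kernel bounds the partition.
                time[stages[i]] = max(time[stages[i]], t)
            if max(area) > area_budget:
                continue  # some partition does not fit on the device
            total = sum(time) + reconfig_cost * (n_parts - 1)
            if best is None or total < best["time"]:
                best = {
                    "time": total,
                    "stages": dict(zip(names, stages)),
                    "impls": dict(zip(names, impls)),
                }
    return best
```

Calling `explore(KERNELS, EDGES, AREA_BUDGET, RECONFIG_COST)` on this toy graph returns the best configuration found, or `None` when no assignment fits the area budget. The search is exponential in the number of kernels, which is tolerable here only because the graph nodes are coarse-grained kernels rather than individual operations.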