SeTHet - Sending Tuned numbers over DMA onto Heterogeneous clusters: an automated precision tuning story
Magnani, Gabriele; Cattaneo, Daniele; Denisov, Lev; Tagliavini, Giuseppe; Agosta, Giovanni; Cherubin, Stefano
Original version
Proceedings of the 21st ACM International Conference on Computing Frontiers. 2024, 258-266. 10.1145/3649153.3649203
Abstract
Energy and performance optimization of embedded hardware and software is of critical importance to achieve the overall system goals. In this work, we study the optimization of memory access through a combination of hardware (Direct Memory Access, DMA) and software (Precision Tuning) techniques, and we propose a compiler toolchain for managing both in the context of heterogeneous RISC-V-based platforms. Our proposed toolchain, SeTHet, enables a 3× to 48× speedup over the baseline system when employing both DMA and precision tuning, regardless of the availability of floating-point units in hardware. SeTHet also achieves up to 16× speedup compared to DMA alone, thus showing that the combination of the two techniques provides a major improvement over either technique employed in isolation.