Improving Energy Efficiency with Special-Purpose Accelerators
The number of transistors per chip and their speed grow exponentially, but the power dissipation per transistor decreases only slightly with each process generation. This leads to increased power density and heat generation, meaning that only a fraction of the chip can be active at any given time. To attack this problem, heterogeneous systems-on-chip are being developed. They consist of multiple specialized cores, each optimized to perform a particular set of tasks. Delegating parts of the application to specific, energy-efficient cores allows more computations to execute within the given power budget, increasing the overall performance of the system.

This thesis proposes a methodology for developing a special-purpose accelerator for a given application to create an energy-efficient heterogeneous system-on-chip based on the Xilinx Zynq platform. This work introduces the Xilinx tool suite used during development and defines the complete design workflow for implementing the accelerator and running the application on the accelerated system. It also evaluates which optimization techniques lead to the most energy-efficient implementation. The simulations show that pipelining, separate ports for reading and writing data, and a small, fast, local memory improve the performance of the accelerator by a factor of 44.4x and its energy efficiency by 379x.

The accelerator is physically implemented on the Xilinx Zynq SoC and acts as a co-processor for the ARM CPU available on the system. This work proposes a methodology for evaluating the physical power consumption and performance of various configurations of the system. For the given application, the system with the accelerator running at 125 MHz is 1.5x faster and 2.15x more energy-efficient than the application executing only on the CPU at 666 MHz. If the clock frequencies are matched at 100 MHz, the accelerated system is 3.6x faster and 3x more energy-efficient.
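The relationship between the reported speedups and energy-efficiency gains can be illustrated with a small calculation. The sketch below is not taken from the thesis. It assumes that energy is average power multiplied by execution time, and that an "energy-efficiency gain" means the ratio of baseline energy to accelerated energy for the same workload. The thesis may define or measure these quantities differently, so the derived power ratios are only what the numbers would imply under that assumption. The function name `implied_power_ratio` is hypothetical.

```python
def implied_power_ratio(speedup: float, efficiency_gain: float) -> float:
    """Return P_accel / P_baseline implied by a speedup and an efficiency gain.

    Assumption (not from the thesis): E = P_avg * T, and
    efficiency_gain = E_baseline / E_accel for the same workload.
    Then E_baseline / E_accel = (P_baseline / P_accel) * speedup,
    so P_accel / P_baseline = speedup / efficiency_gain.
    """
    if speedup <= 0 or efficiency_gain <= 0:
        raise ValueError("ratios must be positive")
    return speedup / efficiency_gain


# (speedup, energy-efficiency gain) pairs quoted in the abstract
# for the physically implemented system.
scenarios = {
    "accelerator at 125 MHz vs. CPU at 666 MHz": (1.5, 2.15),
    "both at 100 MHz": (3.6, 3.0),
}

for name, (speedup, gain) in scenarios.items():
    ratio = implied_power_ratio(speedup, gain)
    print(f"{name}: implied average power ratio = {ratio:.2f}")
```

Under this assumption, the 125 MHz configuration would draw roughly 0.70 times the average power of the CPU-only run, while the matched 100 MHz configuration would draw about 1.2 times as much power and still come out ahead on energy because it finishes 3.6x sooner.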