Task Based Parallel Programming on the SHMAC Multi-Core Prototype
Abstract
In this thesis, different task-based parallel programming implementations are evaluated for use on the tile-based Single-ISA Heterogeneous MAny-core Computer (SHMAC). The OpenMP API is chosen as the preferred parallel programming model because it offers a simple, standardized and portable way of expressing parallelism. The OMPi OpenMP implementation is ported to the SHMAC, and its task-based programming capabilities are verified by running a subset of applications from the BOTS benchmark suite.
A heterogeneous extension to the OpenMP API is implemented within OMPi. The \verb+core()+ clause can be appended to the \verb+task+ directive to specify which type of processing core within the SHMAC platform is most suited to execute the created task.
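As a rough illustration of how such a clause might look in application code, the sketch below creates one floating-point-heavy task and one integer-only task. The clause argument \verb+fpu+, the guard macro \verb+SHMAC_CORE_CLAUSE+, and the functions \verb+dot+, \verb+sum_ints+ and \verb+run_mixed+ are hypothetical names chosen for this example; the exact argument syntax accepted by \verb+core()+ is not given here and is assumed. The guard lets the same file compile with a standard OpenMP compiler, which would reject the unknown clause.

```c
#include <assert.h>
#include <stddef.h>

/* Floating-point-heavy work: a candidate for a core with an FPU. */
static double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Integer-only work: no particular core type is requested. */
static long sum_ints(const int *v, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Creates the two tasks and waits for both to finish. */
void run_mixed(const double *a, const double *b, const int *v, size_t n,
               double *fp_out, long *int_out)
{
    assert(n > 0);
#pragma omp parallel
#pragma omp single
    {
        /* ASSUMPTION: "core(fpu)" is an illustrative spelling of the
         * extension; SHMAC_CORE_CLAUSE is a hypothetical macro that is
         * defined only when building with the extended OMPi. */
#ifdef SHMAC_CORE_CLAUSE
#pragma omp task core(fpu)
#else
#pragma omp task
#endif
        *fp_out = dot(a, b, n);

#pragma omp task
        *int_out = sum_ints(v, n);

#pragma omp taskwait
    }
}
```

Without the guard macro, or without OpenMP enabled at all, the two calls still produce the same results; under the assumptions above, the clause would only affect where the first task is scheduled.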
To verify the effectiveness of this new \verb+core()+ clause, a new processing element containing a floating-point unit is implemented for the SHMAC. With this new processing element in mind, two heterogeneous workloads that benefit from increased floating-point performance on specific tasks are developed. Both benchmarks are run with and without the \verb+core()+ clause, and the difference in performance is measured.
For both workloads, use of the \verb+core()+ clause is shown to reduce the runtime, resulting in speedups of 50\% and 17\%, respectively.