Vectorized Benchmarks for the Berkeley Dwarfs
To guide the development of new hardware that meets ever-increasing needs, researchers and system designers need high-quality performance evaluation tools. In computer science, benchmarking has emerged as one of the most important methods for this purpose. Multiple benchmarks that collectively evaluate a system across a wide range of characteristics in a specific area of interest are compiled into benchmark suites. The purpose is to increase the chances of gaining insight of high validity, i.e. insight accurate enough to be applied to real computer systems. The Berkeley dwarfs taxonomy, a set of 13 computational patterns in widespread use in science and engineering, can be used for this purpose. Since the start of the 21st century, as conventional instruction-level parallelism has failed to provide further microprocessor performance increases, the industry has been looking for ways to exploit other types of parallelism as well. One of these is data-level parallelism, found in the single instruction, multiple data (SIMD) computer organization. Research shows that vectorization can offer additional benefits, e.g. increased energy efficiency. However, while SIMD is seeing increased adoption today, Cebrian et al. noticed that SIMD-aware benchmarking tools are not as widely available, which they argue can cause SIMD designers to under- or overestimate the impact of their contributions. For this reason, they proposed SIMDwarfs, aiming to offer the research community a SIMD-aware benchmark suite covering all 13 Berkeley dwarfs.

In this thesis we have contributed to SIMDwarfs by analyzing four retrieved, vectorized benchmark implementations from three previously uncovered dwarfs: nbody from n-body methods, nqueens from backtrack and branch-and-bound, and NW and SWat from dynamic programming. All implementations were evaluated using non-vectorized configurations and configurations utilizing the SSE and AVX SIMD extensions. The results indicated that while vectorization offered improved performance, the hardware used for the evaluations limited further performance increases. With these implementations added, SIMDwarfs now covers 10 of the 13 dwarfs.