|dc.description.abstract||In recent years, CPU performance has become energy constrained. If performance is to continue increasing, new methods for creating more energy-efficient CPUs must be explored. Current computing systems use complex CPUs that interface with main memory through a hierarchy of caches. These performance-centric designs spend considerable power and chip area to minimize the gap between CPU and main-memory speeds. Caches contribute a large share of a system's energy consumption. Conventional set-associative level-one data caches (L1 DCs) are performance-critical and are therefore optimized for speed. Their access latency is minimized by accessing all ways in parallel for load operations. However, this wastes a significant amount of energy, since only the data from one way is used. To reduce energy, numerous cache architectures, such as way-prediction, way-shutdown, and highly-associative caches, have been proposed. However, these optimizations often increase latency and complexity, which makes them unattractive for L1 caches.
This thesis covers the implementation and evaluation of a combination of techniques that enables access to only the way in which the data resides. The first technique halts cache ways that cannot possibly contain the requested data. The second technique sequentially accesses the tag and data ways when there is no data dependency with a subsequent instruction. These techniques have been implemented in the SHMAC framework and benchmarked with a subset of the MiBench applications.||