Evaluation of Cache Management Algorithms for Shared Last Level Caches
Master thesis
Permanent link
http://hdl.handle.net/11250/2353585
Publication date
2015
Abstract
The performance gap between processors and main memory has been growing over the last decades. Fast memory structures known as caches were introduced to mitigate some of the effects of this gap. After processor manufacturers reached the limits of single-core processor performance in the early 2000s, multicore processors have become common. Multicore processors commonly share cache space between cores, and algorithms that manage access to shared cache structures have become an important research topic. Many researchers have presented algorithms that are supposed to improve the performance of multicore processors by modifying cache policies. In this thesis, we present and evaluate several recent and important works in the cache management field. We present a simulation framework for evaluation of various cache management algorithms, based on the Sniper simulation system. Several of the presented algorithms are implemented: Thread Aware Dynamic Insertion Policy (TADIP), Dynamic Re-Reference Interval Prediction (DRRIP), Utility Cache Partition (UCP), Promotion/Insertion Pseudo-Partitioning (PIPP), and Probabilistic Shared Cache Management (PriSM). The implemented algorithms are evaluated against the commonly used Least Recently Used (LRU) replacement policy and against each other. In addition, we perform five sensitivity analysis experiments, exploring algorithm sensitivity to changes in the simulated architecture. In total, data from almost 9000 simulation runs is used in our evaluation.
Our results suggest that all implemented algorithms mostly perform as well as or better than LRU in 4-core architectures. In 8- and 16-core architectures, some of the algorithms, especially PIPP, perform worse than LRU. Throughout all our experiments, UCP, the oldest of the evaluated alternatives to LRU, is the best performer, with an average performance increase of about 5%. We also show that the UCP performance increase grows to more than 20% when available cache and memory resources are reduced.