Improving the Performance of Processor Core Simulation in the M5 Simulator
Abstract
Simulators are often used to evaluate new ideas in computer architecture research. Unfortunately, detailed simulation is computationally expensive, leading to long simulation turn-around times. This is particularly true when simulating chip multi-processors (CMPs), as it requires simulation of multiple cores. Much effort has therefore been put into developing techniques capable of accelerating simulation. At the NTNU Computer Architecture Research Group (NCAR), research is focused on how the performance of memory systems in CMPs can be improved. For this kind of research, it is permissible to simulate the processor in less detail in order to speed up simulation. In this work, Gabriel Loh's time-stamping approach to improving simulator efficiency is investigated. Here, the central idea is to speed up simulation by avoiding detailed and costly simulation of pipeline structures at every clock cycle. Instead, time-stamps are associated with every processor resource. Instruction throughput can be estimated by considering all time-stamps involved in the execution of an instruction. A series of time-stamp rules update the time-stamps as the simulation progresses. The feasibility of implementing Loh's time-stamping approach in the M5 simulator is evaluated. An extension necessary for time-stamping to be applied in M5 is developed. Furthermore, new time-stamping rules capable of simulating out-of-order cache accesses and non-pipelined functional units are developed. As part of this work, a new processor model based on the above extensions has also been implemented.
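
The general idea of time-stamping can be illustrated with a minimal sketch. It is not Loh's actual rule set and not the M5 implementation: the names `Instr` and `estimate_cycles`, the one-instruction-per-cycle fetch assumption, and the two simplified update rules are hypothetical and chosen only for illustration. Each register and each functional unit carries a time-stamp, and an instruction's issue time is the maximum of the time-stamps it depends on, so no per-cycle pipeline state needs to be simulated.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Hypothetical instruction record used only for this illustration."""
    dest: str      # destination register
    srcs: tuple    # source registers
    unit: str      # functional unit the instruction needs
    latency: int   # execution latency in cycles


def estimate_cycles(program, pipelined):
    """Estimate the completion time of `program` using time-stamps only.

    `pipelined` maps a functional unit name to True (a new instruction can
    start every cycle) or False (the unit is busy until the result is done).
    """
    reg_ready = {}   # time-stamp: cycle at which each register value is ready
    unit_free = {}   # time-stamp: cycle at which each unit accepts new work
    fetch_time = 0   # assumption: one instruction is fetched per cycle
    last_done = 0

    for ins in program:
        # Rule 1: an instruction issues once it has been fetched, its
        # operands are ready, and its functional unit is free.
        operands = max((reg_ready.get(r, 0) for r in ins.srcs), default=0)
        issue = max(fetch_time, operands, unit_free.get(ins.unit, 0))
        done = issue + ins.latency

        # Rule 2: update the time-stamps of the resources that were used.
        reg_ready[ins.dest] = done
        if pipelined.get(ins.unit, True):
            unit_free[ins.unit] = issue + 1   # pipelined: free next cycle
        else:
            unit_free[ins.unit] = done        # non-pipelined: busy until done

        fetch_time += 1
        last_done = max(last_done, done)

    return last_done


program = [
    Instr("r1", (), "alu", 1),
    Instr("r2", ("r1",), "mul", 3),
    Instr("r3", ("r1",), "mul", 3),
]
print(estimate_cycles(program, {"alu": True, "mul": True}))   # pipelined multiplier
print(estimate_cycles(program, {"alu": True, "mul": False}))  # non-pipelined multiplier
```

In this toy example the two multiplies overlap when the multiplier is pipelined and serialize when it is not, which is the kind of difference a rule for non-pipelined functional units has to capture.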