Show simple item record

dc.contributor.author: Gottschall, Björn
dc.contributor.author: Eeckhout, Lieven
dc.contributor.author: Jahre, Magnus
dc.date.accessioned: 2023-08-16T09:09:57Z
dc.date.available: 2023-08-16T09:09:57Z
dc.date.created: 2023-06-30T09:22:58Z
dc.date.issued: 2023
dc.identifier.isbn: 979-8-4007-0095-8
dc.identifier.uri: https://hdl.handle.net/11250/3084348
dc.description.abstract: As computer architectures become increasingly complex and heterogeneous, it becomes progressively more difficult to write applications that make good use of hardware resources. Performance analysis tools are hence critically important as they are the only way through which developers can gain insight into the reasons why their application performs as it does. State-of-the-art performance analysis tools capture a plethora of performance events and are practically non-intrusive, but performance optimization is still extremely challenging. We believe that the fundamental reason is that current state-of-the-art tools in general cannot explain why executing the application's performance-critical instructions takes time. We hence propose Time-Proportional Event Analysis (TEA) which explains why the architecture spends time executing the application's performance-critical instructions by creating time-proportional Per-Instruction Cycle Stacks (PICS). PICS unify performance profiling and performance event analysis, and thereby (i) report the contribution of each static instruction to overall execution time, and (ii) break down per-instruction execution time across the (combinations of) performance events that a static instruction was subjected to across its dynamic executions. Creating time-proportional PICS requires tracking performance events across all in-flight instructions, but TEA only increases per-core power consumption by ~3.2 mW (~0.1%) because we carefully select events to balance insight and overhead. TEA leverages statistical sampling to keep performance overhead at 1.1% on average while incurring an average error of 2.1% compared to a non-sampling golden reference; a significant improvement upon the 55.6%, 55.5%, and 56.0% average error for AMD IBS, Arm SPE, and IBM RIS.
We demonstrate that TEA's accuracy matters by using TEA to identify performance issues in the SPEC CPU2017 benchmarks lbm and nab that, once addressed, yield speedups of 1.28× and 2.45×, respectively. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: ACM [en_US]
dc.relation.ispartof: Proceedings of the 50th Annual International Symposium on Computer Architecture (ISCA'23)
dc.rights: Attribution 4.0 International [*]
dc.title: TEA: Time-Proportional Event Analysis [en_US]
dc.title.alternative: TEA: Time-Proportional Event Analysis [en_US]
dc.type: Chapter [en_US]
dc.description.version: acceptedVersion [en_US]
dc.source.pagenumber: 315-327 [en_US]
dc.identifier.doi: 10.1145/3579371.3589058
dc.identifier.cristin: 2159680
dc.relation.project: Norges forskningsråd: 286596 [en_US]
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 1
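
The abstract describes a Per-Instruction Cycle Stack (PICS) as a breakdown of each static instruction's share of execution time across the combinations of performance events it was subjected to. The following is a minimal sketch of that aggregation step only, not the paper's implementation. It assumes the samples have already been attributed time-proportionally to instructions (the hard part, which TEA handles and which is not modeled here). The function name `build_pics`, the sample layout, and the event names are all hypothetical.

```python
from collections import Counter, defaultdict


def build_pics(samples):
    """Aggregate samples into per-instruction cycle stacks.

    samples: iterable of (instruction_address, events) pairs, where events
    is an iterable of event-name strings observed for that sample.
    Returns {address: {frozenset_of_events: fraction_of_all_samples}}.
    """
    counts = defaultdict(Counter)
    total = 0
    for address, events in samples:
        # Key on the *combination* of events, not on each event separately.
        counts[address][frozenset(events)] += 1
        total += 1
    if total == 0:
        return {}
    return {
        address: {combo: n / total for combo, n in stack.items()}
        for address, stack in counts.items()
    }


# Hypothetical samples: addresses and event names are made up for illustration.
samples = [
    (0x400, {"L1D_MISS"}),
    (0x400, {"L1D_MISS", "TLB_MISS"}),
    (0x400, {"L1D_MISS"}),
    (0x404, set()),  # sample with no recorded event
]
pics = build_pics(samples)

for address, stack in sorted(pics.items()):
    share = sum(stack.values())
    print(f"{address:#x}: {share:.0%} of samples")
    for combo, fraction in stack.items():
        label = "+".join(sorted(combo)) or "(no event)"
        print(f"    {label}: {fraction:.0%}")
```

Summing one instruction's stack gives its share of all samples, which corresponds to point (i) in the abstract; the individual entries of the stack correspond to point (ii).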

