TEA: Time-Proportional Event Analysis

Gottschall, Björn; Eeckhout, Lieven; Jahre, Magnus

dc.contributor.author	Gottschall, Björn
dc.contributor.author	Eeckhout, Lieven
dc.contributor.author	Jahre, Magnus
dc.date.accessioned	2023-08-16T09:09:57Z
dc.date.available	2023-08-16T09:09:57Z
dc.date.created	2023-06-30T09:22:58Z
dc.date.issued	2023
dc.identifier.isbn	979-8-4007-0095-8
dc.identifier.uri	https://hdl.handle.net/11250/3084348
dc.description.abstract	As computer architectures become increasingly complex and heterogeneous, it becomes progressively more difficult to write applications that make good use of hardware resources. Performance analysis tools are hence critically important as they are the only way through which developers can gain insight into the reasons why their application performs as it does. State-of-the-art performance analysis tools capture a plethora of performance events and are practically non-intrusive, but performance optimization is still extremely challenging. We believe that the fundamental reason is that current state-of-the-art tools in general cannot explain why executing the application's performance-critical instructions takes time. We hence propose Time-Proportional Event Analysis (TEA), which explains why the architecture spends time executing the application's performance-critical instructions by creating time-proportional Per-Instruction Cycle Stacks (PICS). PICS unify performance profiling and performance event analysis, and thereby (i) report the contribution of each static instruction to overall execution time, and (ii) break down per-instruction execution time across the (combinations of) performance events that a static instruction was subjected to across its dynamic executions. Creating time-proportional PICS requires tracking performance events across all in-flight instructions, but TEA only increases per-core power consumption by ~3.2 mW (~0.1%) because we carefully select events to balance insight and overhead. TEA leverages statistical sampling to keep performance overhead at 1.1% on average while incurring an average error of 2.1% compared to a non-sampling golden reference; a significant improvement upon the 55.6%, 55.5%, and 56.0% average error for AMD IBS, Arm SPE, and IBM RIS. We demonstrate that TEA's accuracy matters by using TEA to identify performance issues in the SPEC CPU2017 benchmarks lbm and nab that, once addressed, yield speedups of 1.28× and 2.45×, respectively.	en_US
dc.language.iso	eng	en_US
dc.publisher	ACM	en_US
dc.relation.ispartof	Proceedings of the 50th Annual International Symposium on Computer Architecture (ISCA'23)
dc.rights	Attribution 4.0 International	*
dc.title	TEA: Time-Proportional Event Analysis	en_US
dc.title.alternative	TEA: Time-Proportional Event Analysis	en_US
dc.type	Chapter	en_US
dc.description.version	acceptedVersion	en_US
dc.source.pagenumber	315-327	en_US
dc.identifier.doi	10.1145/3579371.3589058
dc.identifier.cristin	2159680
dc.relation.project	Norges forskningsråd: 286596	en_US
cristin.ispublished	true
cristin.fulltext	postprint
cristin.qualitycode	1
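
The abstract describes PICS as a per-instruction breakdown of execution time across combinations of performance events. The following is a minimal sketch of that aggregation step only, not the authors' implementation. It assumes that samples have already been collected in a time-proportional way and arrive as hypothetical `(pc, events)` pairs; the function names `build_pics` and `instruction_share` and the event names in the usage note are illustrative and do not come from the paper.

```python
from collections import defaultdict


def build_pics(samples):
    """Aggregate samples into per-instruction cycle stacks.

    `samples` is an iterable of (pc, events) pairs (a hypothetical format):
    `pc` identifies a static instruction and `events` is an iterable of
    event names observed for the sampled dynamic execution.

    Returns {pc: {frozenset(events): fraction_of_all_samples}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    total = 0
    for pc, events in samples:
        # Each distinct combination of events gets its own stack component.
        counts[pc][frozenset(events)] += 1
        total += 1
    if total == 0:
        return {}
    # Normalise by the total sample count so that every component is an
    # estimated share of overall execution time.
    return {
        pc: {combo: n / total for combo, n in stack.items()}
        for pc, stack in counts.items()
    }


def instruction_share(pics, pc):
    """Estimated share of total execution time attributed to one instruction."""
    return sum(pics.get(pc, {}).values())
```

For example, with four samples of which three hit instruction `0x10` (two with a hypothetical `"cache_miss"` event, one with no event) and one hits `0x20`, `instruction_share(build_pics(samples), 0x10)` evaluates to `0.75`, and the `"cache_miss"` component of that instruction's stack is `0.5`.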

Associated file(s)

Filename: tea-isca23-final-author-copy.pdf
Size: 877.1Kb
Format: PDF


This item appears in the following collection(s)

Department of Computer Science [6772]
Publications from CRIStin - NTNU [38070]
