Freeway: Maximizing MLP for Slice-Out-of-Order Execution

Kumar, Rakesh; Alipour, Mehdi; Black-Schaffer, David

dc.contributor.author	Kumar, Rakesh
dc.contributor.author	Alipour, Mehdi
dc.contributor.author	Black-Schaffer, David
dc.date.accessioned	2020-01-22T07:24:56Z
dc.date.available	2020-01-22T07:24:56Z
dc.date.created	2019-12-27T18:00:48Z
dc.date.issued	2019
dc.identifier.citation	IEEE Symposium on High-Performance Computer Architecture (HPCA). 2019, 558-569.	nb_NO
dc.identifier.issn	1530-0897
dc.identifier.uri	http://hdl.handle.net/11250/2637353
dc.description.abstract	Exploiting memory level parallelism (MLP) is crucial to hide long memory and last level cache access latencies. While out-of-order (OoO) cores, and techniques building on them, are effective at exploiting MLP, they deliver poor energy efficiency due to their complex hardware and the resulting energy overheads. As energy efficiency becomes the prime design constraint, we investigate low complexity/energy mechanisms to exploit MLP. This work revisits slice-out-of-order (sOoO) cores as an energy efficient alternative to OoO cores for MLP exploitation. These cores construct slices of MLP generating instructions and execute them out-of-order with respect to the rest of instructions. However, the slices and the remaining instructions, by themselves, execute in-order. Though their energy overhead is low compared to full OoO cores, sOoO cores fall considerably behind in terms of MLP extraction. We observe that their dependence-oblivious in-order slice execution causes dependent slices to frequently block MLP generation. To boost MLP generation in sOoO cores, we introduce Freeway, a sOoO core based on a new dependence-aware slice execution policy that tracks dependent slices and keeps them out of the way of MLP extraction. The proposed core incurs minimal area and power overheads, yet approaches the MLP benefits of fully OoO cores. Our evaluation shows that Freeway outperforms the state-of-the-art sOoO core by 12% and is within 7% of the MLP limits of full OoO execution.	nb_NO
dc.language.iso	eng	nb_NO
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)	nb_NO
dc.title	Freeway: Maximizing MLP for Slice-Out-of-Order Execution	nb_NO
dc.type	Journal article	nb_NO
dc.type	Peer reviewed	nb_NO
dc.description.version	acceptedVersion	nb_NO
dc.source.pagenumber	558-569	nb_NO
dc.source.journal	IEEE Symposium on High-Performance Computer Architecture (HPCA)	nb_NO
dc.identifier.doi	10.1109/HPCA.2019.00009
dc.identifier.cristin	1764009
dc.description.localcode	© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	nb_NO
cristin.unitcode	194,63,10,0
cristin.unitname	Institutt for datateknologi og informatikk
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	2

Associated file(s)

File name: Freeway_HPCA19.pdf
Size: 1.884 MB
Format: PDF
Description: Kumar

This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6560]
Publikasjoner fra CRIStin - NTNU [37325]
