dc.contributor.author | Asheim, Truls | |
dc.contributor.author | Grot, Boris | |
dc.contributor.author | Kumar, Rakesh | |
dc.date.accessioned | 2022-10-24T10:39:38Z | |
dc.date.available | 2022-10-24T10:39:38Z | |
dc.date.created | 2021-10-29T07:38:13Z | |
dc.date.issued | 2021 | |
dc.identifier.citation | IEEE computer architecture letters. 2021, 20 (2), 134-137. | en_US |
dc.identifier.issn | 1556-6056 | |
dc.identifier.uri | https://hdl.handle.net/11250/3027840 | |
dc.description.abstract | Many contemporary applications feature multi-megabyte instruction footprints that overwhelm the capacity of branch target buffers (BTBs) and instruction caches (L1-I), causing frequent front-end stalls that hurt performance. The BTB is crucial for performance because it enables the front-end to accurately resolve the upcoming execution path and steer instruction fetch appropriately. It also enables highly effective fetch-directed instruction prefetching that can eliminate many L1-I misses. For these reasons, commercial processors allocate vast amounts of storage capacity to BTBs. This letter aims to reduce BTB storage requirements by optimizing the organization of BTB entries. Our key insight is that today's BTBs store the full target address for each branch, yet the vast majority of dynamic branches have short offsets requiring just a handful of bits to encode. Based on this insight, we organize the BTB as an ensemble of smaller BTBs, each storing offsets within a particular range. Doing so enables a dramatic reduction in storage for target addresses. We also compress tags to reduce the tag storage cost. Our final design, called BTB-X, uses an ensemble of five BTBs with compressed tags, which enables it to track 2.8x more branches than a conventional BTB with the same storage budget. | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en_US |
dc.title | BTB-X: A Storage-Effective BTB Organization | en_US |
dc.type | Peer reviewed | en_US |
dc.type | Journal article | en_US |
dc.description.version | acceptedVersion | en_US |
dc.rights.holder | © IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US |
dc.source.pagenumber | 134-137 | en_US |
dc.source.volume | 20 | en_US |
dc.source.journal | IEEE computer architecture letters | en_US |
dc.source.issue | 2 | en_US |
dc.identifier.doi | 10.1109/LCA.2021.3109945 | |
dc.identifier.cristin | 1949474 | |
dc.relation.project | Norges forskningsråd: 302279 | en_US |
dc.relation.project | Notur/NorStore: NN4650K | en_US |
cristin.ispublished | true | |
cristin.fulltext | preprint | |
cristin.qualitycode | 1 | |