dc.contributor.author: Asheim, Truls
dc.contributor.author: Grot, Boris
dc.contributor.author: Kumar, Rakesh
dc.date.accessioned: 2022-10-24T10:39:38Z
dc.date.available: 2022-10-24T10:39:38Z
dc.date.created: 2021-10-29T07:38:13Z
dc.date.issued: 2021
dc.identifier.citation: IEEE computer architecture letters. 2021, 20 (2), 134-137. [en_US]
dc.identifier.issn: 1556-6056
dc.identifier.uri: https://hdl.handle.net/11250/3027840
dc.description.abstract: Many contemporary applications feature multi-megabyte instruction footprints that overwhelm the capacity of branch target buffers (BTB) and instruction caches (L1-I), causing frequent front-end stalls that inevitably hurt performance. BTB is crucial for performance as it enables the front-end to accurately resolve the upcoming execution path and steer instruction fetch appropriately. Moreover, it also enables highly effective fetch-directed instruction prefetching that can eliminate many L1-I misses. For these reasons, commercial processors allocate vast amounts of storage capacity to BTBs. This letter aims to reduce BTB storage requirements by optimizing the organization of BTB entries. Our key insight is that today's BTBs store the full target address for each branch, yet the vast majority of dynamic branches have short offsets requiring just a handful of bits to encode. Based on this insight, we organize the BTB as an ensemble of smaller BTBs, each storing offsets within a particular range. Doing so enables a dramatic reduction in storage for target addresses. We also compress tags to reduce the tag storage cost. Our final design, called BTB-X, uses an ensemble of five BTBs with compressed tags that enables it to track 2.8x more branches than a conventional BTB with the same storage budget. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.title: BTB-X: A Storage-Effective BTB Organization [en_US]
dc.type: Peer reviewed [en_US]
dc.type: Journal article [en_US]
dc.description.version: acceptedVersion [en_US]
dc.rights.holder: © IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. [en_US]
dc.source.pagenumber: 134-137 [en_US]
dc.source.volume: 20 [en_US]
dc.source.journal: IEEE computer architecture letters [en_US]
dc.source.issue: 2 [en_US]
dc.identifier.doi: 10.1109/LCA.2021.3109945
dc.identifier.cristin: 1949474
dc.relation.project: Norges forskningsråd: 302279 [en_US]
dc.relation.project: Notur/NorStore: NN4650K [en_US]
cristin.ispublished: true
cristin.fulltext: preprint
cristin.qualitycode: 1
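
Illustrative note (not part of the repository record): the abstract's central idea, storing short branch offsets in an ensemble of BTBs that each cover one offset range, can be sketched roughly as follows. The function names and the five partition widths are hypothetical placeholders chosen for illustration; the abstract does not state the widths used in BTB-X, and tag compression is not modeled here.

```python
def offset_bits(branch_pc: int, target: int) -> int:
    """Bits needed to hold the signed offset (target - branch_pc) in two's complement."""
    offset = target - branch_pc
    # A non-negative value n needs n.bit_length() magnitude bits plus a sign bit;
    # a negative value n needs (~n).bit_length() magnitude bits plus a sign bit.
    magnitude = offset if offset >= 0 else ~offset
    return magnitude.bit_length() + 1


# Hypothetical offset widths (in bits) for an ensemble of five BTBs.
# These numbers are assumptions for the sketch, not values from the paper.
PARTITION_WIDTHS = (8, 13, 23, 32, 48)


def pick_partition(branch_pc: int, target: int, widths=PARTITION_WIDTHS) -> int:
    """Return the index of the narrowest partition whose offset field fits this branch."""
    needed = offset_bits(branch_pc, target)
    for index, width in enumerate(widths):
        if needed <= width:
            return index
    raise ValueError(f"offset needs {needed} bits, more than any partition provides")


if __name__ == "__main__":
    # A short forward branch fits in the narrowest partition.
    print(pick_partition(0x1000, 0x1010))
    # A far backward branch needs a wider offset field.
    print(pick_partition(0x400000, 0x1000))
```

Under these assumed widths, a branch whose target lies a few bytes away occupies an 8-bit offset field rather than a full target address, which is the source of the storage saving the abstract describes.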


