Show simple item record

dc.contributor.advisor: Elster, Anne Cathrine [nb_NO]
dc.contributor.author: Jensen, Rune Erlend [nb_NO]
dc.date.accessioned: 2014-12-19T13:33:54Z
dc.date.available: 2014-12-19T13:33:54Z
dc.date.created: 2010-09-04 [nb_NO]
dc.date.issued: 2009 [nb_NO]
dc.identifier: 348746 [nb_NO]
dc.identifier: ntnudaim:4165 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/251322
dc.description.abstract: This thesis describes novel techniques and test implementations for optimizing numerically intensive codes. Our main focus is on how given algorithms can be adapted to run efficiently on modern microprocessors, exploring several architectural features, including instruction selection and access patterns related to having several levels of cache. Our approach is also shown to be relevant for multicore architectures. Our primary target applications are linear algebra routines in the form of matrix multiply with dense matrices. We analyze how current compilers, microprocessors, and common optimization techniques (like loop tiling and data relocation) interact. A tunable assembly code generator is developed, built, and tested on a basic BLAS level-3 routine to side-step some of the performance issues of modern compilers. Our generator has been tested on both the Intel Pentium 4 and Intel's Core 2 processors. For the Pentium 4, a 10.8% speed-up is achieved over ATLAS's rank2k, and a 17% speed-up is achieved over MKL's implementation for 4000-by-4032 matrices. On the Core 2, we optimized our code for 2000-by-2000 matrices and achieved a 24% and 5% speed-up over ATLAS and MKL, respectively, with our multi-threaded implementation. Decent speed-ups are also shown for other matrix sizes. Considering that our implementation is far from fully tuned, we consider these results very respectable. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.subject: ntnudaim [no_NO]
dc.subject: MIT informatikk [no_NO]
dc.subject: Komplekse datasystemer [no_NO]
dc.title: Techniques and Tools for Optimizing Codes on Modern Architectures: A Low-Level Approach [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 155 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]


