Browsing NTNU Open by Author "Jahre, Magnus"

A Coarse-Grain Reconfigurable Accelerator for Rocket

Gausaker, Philip (Master thesis, 2022)

The search for faster and more energy-efficient computer architectures is becoming increasingly difficult. This is a result of the end of Moore's law and Dennard scaling. Computer architects are looking to alternative paths to achieve faster and more ...

A Comparative Analysis of Shared Cache Management Techniques for Chip Multiprocessors

Grøvdal, Christian Vik (Master thesis, 2013)

In this thesis we present a comparative analysis of shared cache management techniques for chip multiprocessors. When sharing an unmanaged cache between multiple cores, destructive interference can reduce the performance of ...

Accelerating LBM on a Tightly-Coupled Field Programmable Gate Array

Vázquez Maceiras, Mateo (Master thesis, 2021)

It is no longer possible to apply Dennard's principles to scale integrated circuits, and Moore's law is expected to end soon. This has led to intense interest in new methods for achieving performance improvement ...

Accelerating Object Detection for Agricultural Robotics

Boganes, Jørgen (Master thesis, 2020)

Within agricultural technology, or agritech, harvesting fruit is an expensive and time-consuming process. It is usually carried out by human labor, and agritech is therefore a field where automation has great ...

Accelerating Sparse Linear Algebra and Deep Neural Networks on Reconfigurable Platforms

Umuroglu, Yaman (Doctoral theses at NTNU;2018:1, Doctoral thesis, 2018)

Regardless of whether the chosen figure of merit is execution time, throughput, battery life for an embedded system or total cost of ownership for a datacenter, today’s computers are fundamentally limited by their energy ...

Balancing Performance Against Cost and Sustainability in Multi-Chip-Module GPUs

Zhang, Shiqing; Naderan-Tahan, Mahmood; Jahre, Magnus; Eeckhout, Lieven (Peer reviewed; Journal article, 2023)

MCM-GPUs scale performance by integrating multiple chiplets within the same package. How to partition the aggregate compute resources across chiplets poses a fundamental trade-off in performance versus cost and sustainability. ...

Challenges in the Realm of Embedded Real-Time Image Processing

Millet, Philippe; Grinberg, Michael; Jahre, Magnus (Chapter, 2021)

The development of power-efficient solutions gives new embedded products the ability to analyse images and thereby brings more intelligence to embedded systems—providing more and better services of higher quality as well ...

Challenges of Reducing Cycle-Accurate Simulation Time for TBP Applications

Iordan, Alexandru Ciprian; Jahre, Magnus; Natvig, Lasse (Journal article; Peer reviewed, 2013)

Cycle-accurate simulation is an important tool that depends on the computational power of supercomputers. Unfortunately, simulations of modern multi-core platforms can take weeks or months. In this paper, we look into the ...

Characterizing Multi-Chip GPU Data Sharing

Zhang, Shiqing; Naderan-Tahan, Mahmood; Jahre, Magnus; Eeckhout, Lieven (Journal article; Peer reviewed, 2023)

Multi-chip Graphics Processing Unit (GPU) systems are critical to scale performance beyond a single GPU chip for a wide variety of important emerging applications. A key challenge for multi-chip GPUs, though, is how to ...

Computing in Unstructured Matter

Lykkebø, Odd Rune S. (Doctoral theses at NTNU;2017:90, Doctoral thesis, 2017)

DCMI: A Scalable Strategy for Accelerating Iterative Stencil Loops on FPGAs

Koraei, Mostafa; Fatemi, Omid; Jahre, Magnus (Journal article; Peer reviewed, 2019)

Iterative Stencil Loops (ISLs) are the key kernel within a range of compute-intensive applications. To accelerate ISLs with Field Programmable Gate Arrays, it is critical to exploit parallelism (1) among elements within ...

Delegated Replies: Alleviating Network Clogging in Heterogeneous Architectures

Zhao, Xia; Eeckhout, Lieven; Jahre, Magnus (Peer reviewed; Journal article, 2022)

Heterogeneous architectures with latency-sensitive CPU cores and bandwidth-intensive accelerators are attractive as they deliver high performance at favorable cost. These architectures typically have significantly more ...

Designing a Virtual Memory System for the SHMAC Research Infrastructure

Sutterud, Audun (Master thesis, 2017)

The Single-ISA Heterogeneous MAny-core Computer (SHMAC) is an infrastructure for realizing heterogeneous computing systems. The current SHMAC prototype does not have a Memory Management Unit (MMU). An MMU would simplify ...

DTP: Enabling Exhaustive Exploration of FPGA Temporal Partitions for Streaming HPC Applications

Koraei, Mostafa; Jahre, Magnus; Fatemi, S. Omid (Chapter; Peer reviewed, 2017)

Reconfigurable computing systems show great promise for accelerating streaming HPC applications because of their low power consumption and high performance. However, mapping an HPC application to a reconfigurable system ...

Evaluating Shared Last Level Cache Partitioning Algorithms

aan de Wiel, Thomas Alexander (Master thesis, 2017)

Over the past few decades, the development of Dynamic Random-Access Memory (DRAM) has mainly focused on increasing capacity and lowering costs. However, microprocessor development has experienced enormous improvements in ...

Evaluating the Energy Consumption of Asset Tracking Applications

Karstad, Ådne (Master thesis, 2022)

With the emergence of energy-efficient cellular IoT, it is crucial to research the relationship between security and energy consumption. In this thesis, I have implemented a Generic Asset Tracking Application to evaluate the trade-off ...

Evaluation of Cache Management Algorithms for Shared Last Level Caches

Olsen, Runar Bergheim (Master thesis, 2015)

The performance gap between processors and main memory has been growing over the last decades. Fast memory structures known as caches were introduced to mitigate some of the effects of this gap. After processor manufacturers ...

Evolution in Materio: - En Kaotisk Tilnærming

Flogard, Eirik Lund (Master thesis, 2015)

This thesis addresses a concept called Evolution in Materio, in which computer-controlled evolution is used to try to exploit a material's natural properties to solve tasks or perform computations. The motivation ...

Extending OMPT to Support Grain Graph Visualization

Langdal, Peder Voldnes (Master thesis, 2017)

Because of physical constraints, performance gains of single-core processors have come to a halt. Computer architects have responded by adding multiple processor cores to their designs. However, for continued performance ...

Extending OMPT to Support Grain Graphs

Langdal, Peder Voldnes; Jahre, Magnus; Muddukrishna, Ananya (Journal article, 2017)

The upcoming profiling API standard OMPT can describe almost all profiling events required to construct grain graphs, a recent visualization that simplifies OpenMP performance analysis. We propose OMPT extensions that ...