A Benchmark-based Performance Model for Memory-bound HPC Applications
PUTIGNY, Bertrand
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
GOGLIN, Brice
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
BARTHOU, Denis
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
PUTIGNY, Bertrand
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
GOGLIN, Brice
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
BARTHOU, Denis
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
< Reduce
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Language
en
Communication dans un congrès
This item was published in
International Conference on High Performance Computing & Simulation (HPCS 2014), 2014-07-21, Bologna. 2014-07-21
IEEE
English Abstract
The increasing computation capability of servers comes with a dramatic increase of their complexity through many cores, multiple levels of caches and NUMA architectures. Exploiting the computing power is increasingly harder ...Read more >
The increasing computation capability of servers comes with a dramatic increase of their complexity through many cores, multiple levels of caches and NUMA architectures. Exploiting the computing power is increasingly harder and programmers need ways to understand the performance behavior. We present an innovative approach for predicting the performance of memory-bound multi-threaded applications. It relies on micro-benchmarks and a compositional model, combining measures of micro-benchmarks in order to model larger codes. Our memory model takes into account cache sizes and cache coherence protocols, having a large impact on performance of multi-threaded codes. Applying this model to real world HPC kernels shows that it can predict their performance with good accuracy, helping taking optimization decisions to increase application's performance.Read less <
English Keywords
multicore
memory model
timing prediction
micro-benchmarks
caches
multicore.
Origin
Hal imported