Performance Analysis and Optimization of the Tiled Cholesky Factorization on NUMA Machines
JEANNOT, Emmanuel
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Language
en
Conference paper
This item was published in
PAAP 2012 - IEEE International Symposium on Parallel Architectures, Algorithms and Programming, 2012-12-17, Taipei. 2012-12
IEEE
English Abstract
We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access (NUMA) shared-memory machines. We show how to optimize thread placement and data placement in order to achieve performance gains of up to 50% compared to state-of-the-art libraries such as PLASMA or MKL.
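For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sequential sketch of a tiled (right-looking) Cholesky factorization in pure Python. It is an illustration only, not the paper's implementation: the function names (`potrf`, `trsm`, `gemm_nt`, `split_tiles`, `tiled_cholesky`) and the list-of-lists tile layout are assumptions made here, and the sketch does not address thread placement or data placement on NUMA nodes, which is the paper's actual subject.

```python
import math

def potrf(a):
    """Unblocked Cholesky of one diagonal tile, in place (lower triangle)."""
    n = len(a)
    for j in range(n):
        for k in range(j):
            a[j][j] -= a[j][k] * a[j][k]
        a[j][j] = math.sqrt(a[j][j])
        for i in range(j + 1, n):
            for k in range(j):
                a[i][j] -= a[i][k] * a[j][k]
            a[i][j] /= a[j][j]
    # Clear the strictly upper part so the tile holds only L.
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 0.0

def trsm(l, a):
    """Solve X * L^T = A for X, overwriting A (L is lower triangular)."""
    n = len(a)
    for r in range(n):
        for j in range(n):
            s = a[r][j]
            for k in range(j):
                s -= a[r][k] * l[j][k]
            a[r][j] = s / l[j][j]

def gemm_nt(a, b, c):
    """C -= A * B^T; with a is b this is the symmetric rank-k update."""
    n = len(c)
    for i in range(n):
        for j in range(n):
            c[i][j] -= sum(a[i][k] * b[j][k] for k in range(n))

def split_tiles(m, bs):
    """Copy a square matrix into an nt x nt grid of bs x bs tiles."""
    nt = len(m) // bs
    return [[[row[j * bs:(j + 1) * bs] for row in m[i * bs:(i + 1) * bs]]
             for j in range(nt)] for i in range(nt)]

def tiled_cholesky(m, bs):
    """Return lower-triangular L with L * L^T = m.

    Assumes m is symmetric positive definite and len(m) is a multiple of bs.
    """
    t = split_tiles(m, bs)
    nt = len(t)
    for k in range(nt):
        potrf(t[k][k])                       # factor the diagonal tile
        for i in range(k + 1, nt):
            trsm(t[k][k], t[i][k])           # update the panel below it
        for i in range(k + 1, nt):
            for j in range(k + 1, i + 1):    # update the trailing lower tiles
                gemm_nt(t[i][k], t[j][k], t[i][j])
    # Reassemble the lower-triangular tiles into a dense matrix.
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for i in range(nt):
        for j in range(i + 1):
            for r in range(bs):
                for c in range(bs):
                    out[i * bs + r][j * bs + c] = t[i][j][r][c]
    return out
```

In a parallel runtime, each kernel call on a tile would typically become a task; which core runs a task and on which memory node its tiles reside are the placement choices the abstract refers to.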
Origin
Hal imported