Symbolic Mapping and Allocation for the Cholesky Factorization on NUMA machines: Results and Optimizations
JEANNOT, Emmanuel
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
JEANNOT, Emmanuel
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
< Reduce
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Language
en
Article de revue
This item was published in
International Journal of High Performance Computing Applications. 2013, vol. 27, n° 3, p. 283--290
SAGE Publications
English Abstract
We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access-time (NUMA) shared memory machines. We show how to optimize thread and data placement in order to achieve performance gains ...Read more >
We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access-time (NUMA) shared memory machines. We show how to optimize thread and data placement in order to achieve performance gains up to 50% compared to state-of- the-art libraries such as PLASMA or MKL.Read less <
Origin
Hal imported