Symbolic Mapping and Allocation for the Cholesky Factorization on NUMA machines: Results and Optimizations
JEANNOT, Emmanuel
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Language
en
Journal article
This document was published in
International Journal of High Performance Computing Applications. 2013, vol. 27, no. 3, pp. 283-290
SAGE Publications
Abstract
We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access-time (NUMA) shared-memory machines. We show how to optimize thread and data placement in order to achieve performance gains of up to 50% compared to state-of-the-art libraries such as PLASMA or MKL.
Origin
Imported from HAL