Symbolic Mapping and Allocation for the Cholesky Factorization on NUMA machines: Results and Optimizations
JEANNOT, Emmanuel
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
JEANNOT, Emmanuel
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
< Leer menos
Efficient runtime systems for parallel architectures [RUNTIME]
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Idioma
en
Article de revue
Este ítem está publicado en
International Journal of High Performance Computing Applications. 2013, vol. 27, n° 3, p. 283--290
SAGE Publications
Resumen en inglés
We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access-time (NUMA) shared memory machines. We show how to optimize thread and data placement in order to achieve performance gains ...Leer más >
We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access-time (NUMA) shared memory machines. We show how to optimize thread and data placement in order to achieve performance gains up to 50% compared to state-of- the-art libraries such as PLASMA or MKL.< Leer menos
Orígen
Importado de HalCentros de investigación