A NUMA Aware Scheduler for a Parallel Sparse Direct Solver
FAVERGE, Mathieu
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]
Algorithms and high performance computing for grand challenge applications [SCALAPPLIX]
LACOSTE, Xavier
Algorithms and high performance computing for grand challenge applications [SCALAPPLIX]
RAMET, Pierre
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Algorithms and high performance computing for grand challenge applications [SCALAPPLIX]
Language
English
Conference paper
This document was published in
PMAA'08, 2008, Neuchâtel
Abstract (in English)
Over the past few years, parallel sparse direct solvers have made significant progress and are now able to efficiently solve industrial three-dimensional problems with several million unknowns. A hybrid MPI-thread implementation of our direct solver PaStiX is already well suited to SMP nodes and new multi-core architectures; it drastically reduces the memory overhead and improves scalability. In the context of distributed NUMA architectures, a dynamic scheduler based on a work-stealing algorithm has been developed to fill in communication idle times. On these architectures, it is important to take NUMA effects into account and to preserve memory affinity during work stealing. The scheduling of communications also needs to be adapted, especially to ensure that they are overlapped by computations. Experiments on numerical test cases will be presented to demonstrate the efficiency of the approach on NUMA architectures. If memory is not large enough for a given problem, disks must be used to store the data that cannot fit in memory (out-of-core storage). The idle times caused by disk accesses have to be managed by our dynamic scheduler to prefetch and save datasets. We therefore design and study specific scheduling algorithms for this particular context.
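The NUMA-aware victim selection described in the abstract can be illustrated with a short sketch. The C fragment below is only an illustration under stated assumptions, not the PaStiX scheduler: it assumes one ready-task queue per core grouped by NUMA node, and all names (queue_t, steal_from, numa_aware_steal) and machine sizes are hypothetical. An idle thread first tries to steal from cores of its own NUMA node, so the stolen task is likely to touch locally allocated data, and only then falls back to remote nodes.

/*
 * Minimal sketch (not the PaStiX implementation) of NUMA-aware work stealing.
 * Assumptions: one ready-task queue per core, queues grouped by NUMA node,
 * illustrative machine sizes.
 */
#include <stdio.h>
#include <stdbool.h>

#define NB_NODES        2   /* NUMA nodes in the machine        */
#define CORES_PER_NODE  4   /* cores (and task queues) per node */
#define QUEUE_CAP      16

typedef struct {
    int tasks[QUEUE_CAP];   /* identifiers of tasks ready to run */
    int count;
} queue_t;

static queue_t queues[NB_NODES][CORES_PER_NODE];

/* Try to steal one task from the tail of a victim queue. */
static bool steal_from(queue_t *victim, int *task)
{
    if (victim->count == 0)
        return false;
    *task = victim->tasks[--victim->count];
    return true;
}

/* NUMA-aware victim selection: cores of the local node first, remote nodes after. */
static bool numa_aware_steal(int my_node, int my_core, int *task)
{
    for (int c = 0; c < CORES_PER_NODE; c++)            /* 1. same NUMA node */
        if (c != my_core && steal_from(&queues[my_node][c], task))
            return true;
    for (int n = 0; n < NB_NODES; n++) {                /* 2. remote nodes   */
        if (n == my_node)
            continue;
        for (int c = 0; c < CORES_PER_NODE; c++)
            if (steal_from(&queues[n][c], task))
                return true;
    }
    return false;   /* nothing to steal: poll pending communications instead */
}

int main(void)
{
    /* Fill one local and one remote queue, then steal as core 0 of node 0. */
    queues[0][1].tasks[queues[0][1].count++] = 42;   /* local victim  */
    queues[1][0].tasks[queues[1][0].count++] = 99;   /* remote victim */

    int task;
    while (numa_aware_steal(0, 0, &task))
        printf("stole task %d\n", task);             /* 42 first, then 99 */
    return 0;
}

Preferring local victims is what preserves memory affinity: the penalty of remote memory accesses is paid only once the whole local node has run out of work.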
Keywords (in English)
sparse direct solver
NUMA architecture
multi-cores
dynamic scheduling
ANR project
Adaptation et Optimisation des Performances Applicatives sur architectures NUMA. Etude et Mise en Œuvre sur des Applications en SISmologie. - ANR-05-CIGC-0002
Origin
Imported from HAL