AUGONNET, Cédric; THIBAULT, Samuel; NAMYST, Raymond; WACRENIER, Pierre-André

AUGONNET, Cédric
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]

THIBAULT, Samuel
Laboratoire Bordelais de Recherche en Informatique [LaBRI]

NAMYST, Raymond
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Efficient runtime systems for parallel architectures [RUNTIME]

Conference paper

Published in

Euro-Par 2009, 2009-08, Delft. 2009

Abstract

In the field of HPC, the current hardware trend is to design multiprocessor architectures that feature heterogeneous technologies such as specialized coprocessors (e.g., Cell/BE SPUs) or data-parallel accelerators (e.g., GPGPUs). Approaching the theoretical performance of these architectures is a complex issue. Indeed, substantial efforts have already been devoted to efficiently offloading parts of the computations. However, designing an execution model that unifies all computing units and associated embedded memory remains a main challenge. We have thus designed StarPU, an original runtime system providing a high-level, unified execution model tightly coupled with an expressive data management library. The main goal of StarPU is to provide numerical kernel designers with a convenient way to generate parallel tasks over heterogeneous hardware on the one hand, and easily develop and tune powerful scheduling algorithms on the other hand. We have developed several strategies that can be selected seamlessly at run time, and we have demonstrated their efficiency by analyzing the impact of those scheduling policies on several classical linear algebra algorithms that take advantage of multiple cores and GPUs at the same time. In addition to substantial improvements regarding execution times, we obtained consistent superlinear parallelism by actually exploiting the heterogeneous nature of the machine.

StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures