AUGONNET, Cédric; THIBAULT, Samuel; NAMYST, Raymond; WACRENIER, Pierre-André

doi:10.1002/cpe.1631

hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier	Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author	AUGONNET, Cédric
hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier	Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author	THIBAULT, Samuel
hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier	Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author	NAMYST, Raymond
hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier	Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author	WACRENIER, Pierre-André
dc.date.accessioned	2024-04-15T09:48:04Z
dc.date.available	2024-04-15T09:48:04Z
dc.date.issued	2011
dc.identifier.issn	1532-0626
dc.identifier.uri	https://oskar-bordeaux.fr/handle/20.500.12278/198149
dc.description.abstractEn	In the field of HPC, the current hardware trend is to design multiprocessor architectures featuring heterogeneous technologies such as specialized coprocessors (e.g., Cell/BE) or data-parallel accelerators (e.g., GPUs). Approaching the theoretical performance of these architectures is a complex issue. Indeed, substantial efforts have already been devoted to efficiently offloading parts of the computations. However, designing an execution model that unifies all computing units and associated embedded memory remains a major challenge. We therefore designed StarPU, an original runtime system providing a high-level, unified execution model tightly coupled with an expressive data management library. The main goal of StarPU is to provide numerical kernel designers with a convenient way to generate parallel tasks over heterogeneous hardware on the one hand, and to easily develop and tune powerful scheduling algorithms on the other hand. We have developed several strategies that can be selected seamlessly at run time, and we have analyzed their efficiency on several algorithms running simultaneously over multiple cores and a GPU. In addition to substantial improvements in execution times, we have obtained consistent superlinear parallelism by actually exploiting the heterogeneous nature of the machine. Finally, we show that our dynamic approach competes with the highly optimized MAGMA library and overcomes the limitations of the corresponding static scheduling in a portable way.
dc.language.iso	en
dc.publisher	Wiley
dc.title.en	StarPU: a unified platform for task scheduling on heterogeneous multicore architectures
dc.type	Journal article
dc.identifier.doi	10.1002/cpe.1631
dc.subject.hal	Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
bordeaux.journal	Concurrency and Computation: Practice and Experience
bordeaux.page	187-198
bordeaux.volume	23
bordeaux.hal.laboratories	Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800
bordeaux.issue	2
bordeaux.institution	Université de Bordeaux
bordeaux.institution	Bordeaux INP
bordeaux.institution	CNRS
bordeaux.peerReviewed	yes
hal.identifier	inria-00550877
hal.version	1
hal.popular	no
hal.audience	International
hal.origin.link	https://hal.archives-ouvertes.fr//inria-00550877v1
bordeaux.COinS	ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Concurrency%20and%20Computation:%20Practice%20and%20Experience&rft.date=2011&rft.volume=23&rft.issue=2&rft.spage=187-198&rft.epage=187-198&rft.eissn=1532-0626&rft.issn=1532-0626&rft.au=AUGONNET,%20C%C3%A9dric&THIBAULT,%20Samuel&NAMYST,%20Raymond&WACRENIER,%20Pierre-Andr%C3%A9&rft.genre=article

This item appears in the following Collection(s)

Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800
