Show simple item record

hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorAUGONNET, Cédric
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorTHIBAULT, Samuel
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorNAMYST, Raymond
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorWACRENIER, Pierre-André
dc.date.accessioned2024-04-15T09:48:04Z
dc.date.available2024-04-15T09:48:04Z
dc.date.issued2011
dc.identifier.issn1532-0626
dc.identifier.urihttps://oskar-bordeaux.fr/handle/20.500.12278/198149
dc.description.abstractEnIn the field of HPC, the current hardware trend is to design multiprocessor architectures featuring heterogeneous technologies such as specialized coprocessors (e.g., Cell/BE) or data-parallel accelerators (e.g., GPUs). Approaching the theoretical performance of these architectures is a complex issue. Indeed, substantial efforts have already been devoted to efficiently offload parts of the computations. However, designing an execution model that unifies all computing units and associated embedded memory remains a main challenge. We therefore designed StarPU, an original runtime system providing a high-level, unified execution model tightly coupled with an expressive data management library. The main goal of StarPU is to provide numerical kernel designers with a convenient way to generate parallel tasks over heterogeneous hardware on the one hand, and easily develop and tune powerful scheduling algorithms on the other hand. We have developed several strategies that can be selected seamlessly at run-time, and we have analyzed their efficiency on several algorithms running simultaneously over multiple cores and a GPU. In addition to substantial improvements regarding execution times, we have obtained consistent superlinear parallelism by actually exploiting the heterogeneous nature of the machine. We eventually show that our dynamic approach competes with the highly-optimized MAGMA library and overcomes the limitations of the corresponding static scheduling in a portable way.
dc.language.isoen
dc.publisherWiley
dc.title.enStarPU: a unified platform for task scheduling on heterogeneous multicore architectures
dc.typeArticle de revue
dc.identifier.doi10.1002/cpe.1631
dc.subject.halInformatique [cs]/Calcul parallèle, distribué et partagé [cs.DC]
bordeaux.journalConcurrency and Computation: Practice and Experience
bordeaux.page187-198
bordeaux.volume23
bordeaux.hal.laboratoriesLaboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800*
bordeaux.issue2
bordeaux.institutionUniversité de Bordeaux
bordeaux.institutionBordeaux INP
bordeaux.institutionCNRS
bordeaux.peerReviewedoui
hal.identifierinria-00550877
hal.version1
hal.popularnon
hal.audienceInternationale
hal.origin.linkhttps://hal.archives-ouvertes.fr//inria-00550877v1
bordeaux.COinSctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Concurrency%20and%20Computation:%20Practice%20and%20Experience&rft.date=2011&rft.volume=23&rft.issue=2&rft.spage=187-198&rft.epage=187-198&rft.eissn=1532-0626&rft.issn=1532-0626&rft.au=AUGONNET,%20C%C3%A9dric&THIBAULT,%20Samuel&NAMYST,%20Raymond&WACRENIER,%20Pierre-Andr%C3%A9&rft.genre=article


Files in this item

FilesSizeFormatView

There are no files associated with this item.

This item appears in the following Collection(s)

Show simple item record