
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author: HUGO, Andra-Ecaterina
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier: High-End Parallel Algorithms for Challenging Numerical Simulations [HiePACS]
dc.contributor.author: GUERMOUCHE, Abdou
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author: NAMYST, Raymond
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author: WACRENIER, Pierre-André
dc.date.accessioned: 2024-04-15T09:43:23Z
dc.date.available: 2024-04-15T09:43:23Z
dc.date.issued: 2013-05-20
dc.date.conference: 2013-05-20
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/197762
dc.description.abstractEn: Enabling HPC applications to perform efficiently when invoking multiple parallel libraries simultaneously is a great challenge. Even if a single runtime system is used underneath, scheduling tasks or threads coming from different libraries over the same set of hardware resources introduces many issues, such as resource oversubscription, undesirable cache flushes or memory bus contention. This paper presents an extension of StarPU, a runtime system specifically designed for heterogeneous architectures, that allows multiple parallel codes to run concurrently with minimal interference. Such parallel codes run within scheduling contexts that provide confined execution environments which can be used to partition computing resources. Scheduling contexts can be dynamically resized to optimize the allocation of computing resources among concurrently running libraries. We introduce a hypervisor that automatically expands or shrinks contexts using feedback from the runtime system (e.g. resource utilization). We demonstrate the relevance of our approach using benchmarks invoking multiple high performance linear algebra kernels simultaneously on top of heterogeneous multicore machines. We show that our mechanism can dramatically improve the overall application run time (-34%), most notably by reducing the average cache miss ratio (-50%).
dc.language.iso: en
dc.title.en: Composing multiple StarPU applications over heterogeneous machines: a supervised approach
dc.type: Conference paper
dc.subject.hal: Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
bordeaux.hal.laboratories: Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800*
bordeaux.institution: Université de Bordeaux
bordeaux.institution: Bordeaux INP
bordeaux.institution: CNRS
bordeaux.conference.title: Third International Workshop on Accelerators and Hybrid Exascale Systems
bordeaux.country: US
bordeaux.conference.city: Boston
bordeaux.peerReviewed: yes
hal.identifier: hal-00824514
hal.version: 1
hal.invited: no
hal.proceedings: yes
hal.popular: no
hal.audience: International
hal.origin.link: https://hal.archives-ouvertes.fr//hal-00824514v1

