hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author: BROQUEDIS, François
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author: FURMENTO, Nathalie
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author: GOGLIN, Brice
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author: WACRENIER, Pierre-André
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author: NAMYST, Raymond
dc.date.accessioned: 2024-04-15T09:49:21Z
dc.date.available: 2024-04-15T09:49:21Z
dc.date.issued: 2010
dc.identifier.issn: 0885-7458
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/198248
dc.description.abstractEn: Exploiting the full computational power of current hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-uniform architecture so as to avoid remote memory access penalties. Directive-based programming languages such as OpenMP can greatly help to perform such a distribution by providing programmers with an easy way to structure the parallelism of their application and to transmit this information to the runtime system. Our runtime, which is based on a multi-level thread scheduler combined with a NUMA-aware memory manager, converts this information into Scheduling Hints related to thread-memory affinity issues. These hints enable dynamic load distribution guided by application structure and hardware topology, thus helping to achieve performance portability. Several experiments show that mixed solutions (migrating both threads and data) outperform work-stealing-based balancing strategies and Next-Touch-based data distribution policies. These techniques provide insights about additional optimizations.
dc.language.iso: en
dc.publisher: Springer Verlag
dc.title.en: ForestGOMP: an efficient OpenMP environment for NUMA architectures
dc.type: Journal article
dc.identifier.doi: 10.1007/s10766-010-0136-3
dc.subject.hal: Computer Science [cs]/Operating Systems [cs.OS]
bordeaux.journal: International Journal of Parallel Programming
bordeaux.hal.laboratories: Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800*
bordeaux.institution: Université de Bordeaux
bordeaux.institution: Bordeaux INP
bordeaux.institution: CNRS
bordeaux.peerReviewed: yes
hal.identifier: inria-00496295
hal.version: 1
hal.popular: no
hal.audience: International
hal.origin.link: https://hal.archives-ouvertes.fr//inria-00496295v1

