hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorBROQUEDIS, François
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorAUMAGE, Olivier
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorGOGLIN, Brice
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorTHIBAULT, Samuel
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorWACRENIER, Pierre-André
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorNAMYST, Raymond
dc.contributor.editorIEEE
dc.date.accessioned2024-04-15T09:49:46Z
dc.date.available2024-04-15T09:49:46Z
dc.date.issued2010-04
dc.date.conference2010-04-19
dc.identifier.urihttps://oskar-bordeaux.fr/handle/20.500.12278/198285
dc.description.abstractEnThe now commonplace multi-core chips have introduced, by design, a deep hierarchy of memory and cache banks within parallel computers as a tradeoff between the user-friendliness of shared memory on the one hand, and memory access scalability and efficiency on the other. However, getting high performance out of such machines requires a dynamic mapping of application tasks and data onto the underlying architecture. Moreover, depending on the application behavior, this mapping should favor cache affinity, memory bandwidth, computation synchrony, or a combination of these. The great challenge is then to perform this hardware-dependent mapping in a portable, abstract way. To meet this need, we propose a new, hierarchical approach to the execution of OpenMP threads on multicore machines. Our ForestGOMP runtime system dynamically generates structured trees out of OpenMP programs. It also collects relationship information about threads and data. The scheduler uses this information, together with scheduling hints and hardware counter feedback, to select the most appropriate thread and data distribution. ForestGOMP features a high-level platform for developing and tuning portable thread schedulers. We present several applications for which we developed specific scheduling policies that achieve excellent speedups on 16-core machines.
dc.language.isoen
dc.title.enStructuring the execution of OpenMP applications for multicore architectures
dc.typeConference paper
dc.identifier.doi10.1109/IPDPS.2010.5470442
dc.subject.halComputer Science [cs]/Operating Systems [cs.OS]
bordeaux.hal.laboratoriesLaboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800*
bordeaux.institutionUniversité de Bordeaux
bordeaux.institutionBordeaux INP
bordeaux.institutionCNRS
bordeaux.conference.titleInternational Parallel and Distributed Processing Symposium (IPDPS 2010)
bordeaux.countryUS
bordeaux.conference.cityAtlanta
bordeaux.peerReviewedyes
hal.identifierinria-00441472
hal.version1
hal.invitedno
hal.proceedingsyes
hal.conference.end2010-04-23
hal.popularno
hal.audienceInternational
hal.origin.linkhttps://hal.archives-ouvertes.fr//inria-00441472v1
bordeaux.COinSctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.date=2010-04&rft.au=BROQUEDIS,%20Fran%C3%A7ois&AUMAGE,%20Olivier&GOGLIN,%20Brice&THIBAULT,%20Samuel&WACRENIER,%20Pierre-Andr%C3%A9&rft.genre=unknown

