Show simple item record

hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author: PUTIGNY, Bertrand
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author: RUELLE, Benoit
hal.structure.identifier: Efficient runtime systems for parallel architectures [RUNTIME]
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author: GOGLIN, Brice
dc.date.accessioned: 2024-04-15T09:41:39Z
dc.date.available: 2024-04-15T09:41:39Z
dc.date.created: 2013-11-01
dc.date.issued: 2014-05
dc.date.conference: 2014-05-23
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/197617
dc.description.abstractEn: Shared memory MPI communication is an important part of the overall performance of parallel applications. However, understanding the behavior of these data transfers is difficult because of the combined complexity of modern memory architectures with multiple levels of caches and complex cache coherence protocols, of MPI implementations, and of application needs. We analyze shared memory MPI communication from a cache coherence perspective through a new memory model. It captures the memory architecture characteristics with microbenchmarks that exhibit the limitations of the memory accesses involved in the data transfer. We model the performance of intra-node communication without requiring complex analytical models. The advantage of this approach is that it does not require deep knowledge of rarely documented hardware features, such as caching policies or prefetchers, that make modeling modern memory subsystems hardly feasible. Our qualitative analysis based on this result leads to a better understanding of shared memory communication performance for scientific computing. We then discuss some possible optimizations, such as buffer reuse order, cache flushing, and non-temporal instructions, that could be used by MPI implementers.
dc.language.iso: en
dc.publisher: IEEE
dc.title.en: Analysis of MPI Shared-Memory Communication Performance from a Cache Coherence Perspective
dc.type: Conference paper
dc.identifier.doi: 10.1109/IPDPSW.2014.139
dc.subject.hal: Computer Science [cs]/Operating Systems [cs.OS]
bordeaux.hal.laboratories: Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800*
bordeaux.institution: Université de Bordeaux
bordeaux.institution: Bordeaux INP
bordeaux.institution: CNRS
bordeaux.conference.title: PDSEC - The 15th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, held in conjunction with IPDPS
bordeaux.country: US
bordeaux.conference.city: Phoenix, AZ
bordeaux.peerReviewed: yes
hal.identifier: hal-00956307
hal.version: 1
hal.invited: no
hal.proceedings: yes
hal.popular: no
hal.audience: International
hal.origin.link: https://hal.archives-ouvertes.fr//hal-00956307v1
bordeaux.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.date=2014-05&rft.au=PUTIGNY,%20Bertrand&RUELLE,%20Benoit&GOGLIN,%20Brice&rft.genre=unknown


Files in this item

Files | Size | Format | View

No files are associated with this item.

This item appears in the following collection(s)
