Show simple item record

hal.structure.identifierComputer Science Department [CST]
dc.contributor.authorTRAHAY, François
hal.structure.identifierLaboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifierEfficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.authorDENIS, Alexandre
hal.structure.identifierComputer Science Department [CST]
dc.contributor.authorISHIKAWA, Yutaka
dc.date.accessioned2024-04-15T09:43:48Z
dc.date.available2024-04-15T09:43:48Z
dc.date.issued2010-12-10
dc.identifier.urihttps://oskar-bordeaux.fr/handle/20.500.12278/197791
dc.description.abstractEnAs the number of nodes in clusters grows, the probability of failure increases. In this paper, we study failures in the network stack of high-performance networks. We present the design of several fault-tolerance mechanisms that allow communication libraries to detect failures and ensure message integrity. We have implemented these mechanisms in the NewMadeleine communication library, providing quick, portable failure detection and fallback to the remaining available links when an error occurs. Our mechanisms ensure message integrity without significantly degrading network performance. Our evaluation shows that ensuring fault tolerance does not significantly impact the performance of most applications.
dc.language.isoen
dc.subject.enNewMadeleine
dc.subject.enMPI
dc.subject.enMadMPI
dc.subject.enpioman
dc.title.enA Generic and High Performance Approach for Fault Tolerance in Communication Library
dc.typeRapport
dc.subject.halInformatique [cs]/Réseaux et télécommunications [cs.NI]
bordeaux.hal.laboratoriesLaboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800*
bordeaux.institutionUniversité de Bordeaux
bordeaux.institutionBordeaux INP
bordeaux.institutionCNRS
bordeaux.type.institutionINRIA Bordeaux
bordeaux.type.reportrr
hal.identifierhal-00793176
hal.version1
hal.origin.linkhttps://hal.archives-ouvertes.fr//hal-00793176v1
bordeaux.COinSctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.date=2010-12-10&rft.au=TRAHAY,%20Fran%C3%A7ois&DENIS,%20Alexandre&ISHIKAWA,%20Yutaka&rft.genre=unknown

