Show simple item record

hal.structure.identifierReformulations based algorithms for Combinatorial Optimization [Realopt]
dc.contributor.authorBEAUMONT, Olivier
hal.structure.identifierReformulations based algorithms for Combinatorial Optimization [Realopt]
dc.contributor.authorLAMBERT, Thomas
hal.structure.identifierOptimisation des ressources : modèles, algorithmes et ordonnancement [ROMA]
hal.structure.identifierLaboratoire de l'Informatique du Parallélisme [LIP]
dc.contributor.authorMARCHAL, Loris
hal.structure.identifierÉcole normale supérieure - Rennes [ENS Rennes]
dc.contributor.authorTHOMAS, Bastien
dc.date.accessioned2024-04-04T03:08:23Z
dc.date.available2024-04-04T03:08:23Z
dc.date.created2017-02
dc.date.issued2017-02
dc.identifier.urihttps://oskar-bordeaux.fr/handle/20.500.12278/193525
dc.description.abstractEnMapReduce is a well-know framework for distributing data-processingcomputations onto parallel clusters. In MapReduce, a large computationis broken into small tasks that run in parallel on multiple machines,and scales easily to very large clusters of inexpensive commoditycomputers. Before the Map phase, the original dataset is split intodata chunks that are replicated (a constant number of times, usually3) and distributed randomly onto computing nodes. During the Mapphase, local tasks (i.e., tasks whose data chunks are stored locally)are assigned in priority when processors request tasks. In thispaper, we provide the first complete theoretical analysis of datalocality in the Map phase of MapReduce, and more generally, forbag-of-tasks applications that behave like MapReduce. We prove that iftasks are homogeneous (in terms of processing time), as soon as thereplication factor is larger than 2, FindAssignment, a matching basedalgorithm, achieves a quasi-perfect makespan (i.e., optimal up to anadditive constant of one step) using a sophisticated matchingalgorithm. Above result is proved with high probability when thenumber of tasks becomes arbitrarily large, and we therefore complementtheoretical results with simulations that corroborate them even forsmall number of tasks. We also show that the matching-based approachleads to an improvement of data locality during the Map phase andtherefore decreases the amount of communications needed to achieveperfect makespan, compared to the classical MapReduce greedy approach.
dc.description.sponsorshipSolveurs pour architectures hétérogènes utilisant des supports d'exécution - ANR-13-MONU-0007
dc.language.isoen
dc.subject.enResource Allocation and Scheduling
dc.subject.enAnalysis of Randomized Algorithms
dc.subject.enMatchings
dc.subject.enMapReduce
dc.subject.enBalls-into-bins
dc.title.enMatching-Based Assignement Strategies for Improving Data Locality of Map Tasks in MapReduce
dc.typeRapport
dc.subject.halInformatique [cs]/Calcul parallèle, distribué et partagé [cs.DC]
bordeaux.hal.laboratoriesInstitut de Mathématiques de Bordeaux (IMB) - UMR 5251*
bordeaux.institutionUniversité de Bordeaux
bordeaux.institutionBordeaux INP
bordeaux.institutionCNRS
bordeaux.type.institutionInria - Research Centre Grenoble – Rhône-Alpes
bordeaux.type.institutionInria Bordeaux Sud-Ouest
bordeaux.type.reportrr
hal.identifierhal-01386539
hal.version1
hal.origin.linkhttps://hal.archives-ouvertes.fr//hal-01386539v1
bordeaux.COinSctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.date=2017-02&rft.au=BEAUMONT,%20Olivier&LAMBERT,%20Thomas&MARCHAL,%20Loris&THOMAS,%20Bastien&rft.genre=unknown


Files in this item

FilesSizeFormatView

There are no files associated with this item.

This item appears in the following Collection(s)

Show simple item record