
hal.structure.identifierHigh-End Parallel Algorithms for Challenging Numerical Simulations [HiePACS]
dc.contributor.authorAGULLO, Emmanuel
hal.structure.identifierReformulations based algorithms for Combinatorial Optimization [Realopt]
dc.contributor.authorBEAUMONT, Olivier
hal.structure.identifierReformulations based algorithms for Combinatorial Optimization [Realopt]
dc.contributor.authorEYRAUD-DUBOIS, Lionel
hal.structure.identifierSTatic Optimizations, Runtime Methods [STORM]
dc.contributor.authorKUMAR, Suraj
dc.date.created2015-10
dc.date.issued2016-05
dc.date.conference2016-05
dc.description.abstractEnOur goal is to provide an analysis and comparison of static and dynamic strategies for task graph scheduling on platforms consisting of heterogeneous and unrelated resources, such as GPUs and CPUs. Static scheduling strategies, which have been used for years, suffer from several weaknesses. First, it is well known that the underlying optimization problems are NP-Complete, which limits the capability of finding optimal solutions to small cases. Second, parallelism inside processing nodes makes it difficult to precisely predict the performance of both communications and computations, due to shared resources and co-scheduling effects. Recently, to cope with these limitations, many dynamic task-graph based runtime schedulers (StarPU, StarSs, QUARK, PaRSEC) have been proposed. Dynamic schedulers base their allocation and scheduling decisions on the one hand on dynamic information, such as the set of available tasks, the location of data and the state of the resources, and on the other hand on static information, such as task priorities computed from the whole task graph. Our analysis is in-depth, but we concentrate on a single kernel, namely Cholesky factorization of dense matrices on platforms consisting of GPUs and CPUs. This application encompasses many important characteristics in our context. Indeed, it involves 4 different kernels (POTRF, TRSM, SYRK and GEMM) whose acceleration ratios on GPUs differ strongly (from 2.3 for POTRF to 29 for GEMM), and it consists of a phase where the number of available tasks is large, where the careful use of resources is critical, and a phase with few tasks available, where the choice of the task to be executed is crucial. In this paper, we analyze the performance of static and dynamic strategies and we propose a set of intermediate strategies, by adding more static (resp. dynamic) features into dynamic (resp. static) strategies.
Our conclusions are somewhat unexpected in the sense that we prove that static-based strategies are very efficient, even in a context where performance estimations are not very good.
dc.description.sponsorshipSolveurs pour architectures hétérogènes utilisant des supports d'exécution - ANR-13-MONU-0007
dc.language.isoen
dc.publisherIEEE
dc.subject.enCholesky
dc.subject.enAccelerators
dc.subject.enHeterogeneous Systems
dc.subject.enRuntime Systems
dc.subject.enScheduling
dc.subject.enUnrelated Machines
dc.title.enAre Static Schedules so Bad? A Case Study on Cholesky Factorization
dc.typeConference paper
dc.subject.halComputer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
bordeaux.conference.titleIEEE International Parallel & Distributed Processing Symposium (IPDPS 2016)
bordeaux.countryUS
bordeaux.conference.cityChicago, IL
bordeaux.peerReviewedyes
hal.identifierhal-01223573
hal.version1
hal.invitedno
hal.proceedingsyes
hal.conference.end2016-05
hal.popularno
hal.audienceInternational
hal.origin.linkhttps://hal.archives-ouvertes.fr//hal-01223573v1
bordeaux.COinSctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.date=2016-05&rft.au=AGULLO,%20Emmanuel&BEAUMONT,%20Olivier&EYRAUD-DUBOIS,%20Lionel&KUMAR,%20Suraj&rft.genre=unknown

