AGULLO, Emmanuel; AUGONNET, Cédric; DONGARRA, Jack; LTAIEF, Hatem; NAMYST, Raymond; ROMAN, Jean; THIBAULT, Samuel; TOMOV, Stanimire

hal.structure.identifier	High-End Parallel Algorithms for Challenging Numerical Simulations [HiePACS]
hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author	AGULLO, Emmanuel
hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier	Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author	AUGONNET, Cédric
hal.structure.identifier	Department of Computer Science. University of Tennessee
hal.structure.identifier	Oak Ridge National Laboratory [Oak Ridge] [ORNL]
hal.structure.identifier	School of Computer Science [Manchester]
dc.contributor.author	DONGARRA, Jack
hal.structure.identifier	Department of Computer Science. University of Tennessee
dc.contributor.author	LTAIEF, Hatem
hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier	Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author	NAMYST, Raymond
hal.structure.identifier	High-End Parallel Algorithms for Challenging Numerical Simulations [HiePACS]
hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author	ROMAN, Jean
hal.structure.identifier	Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier	Efficient runtime systems for parallel architectures [RUNTIME]
dc.contributor.author	THIBAULT, Samuel
hal.structure.identifier	Department of Computer Science. University of Tennessee
dc.contributor.author	TOMOV, Stanimire
dc.date.accessioned	2024-04-15T09:48:12Z
dc.date.available	2024-04-15T09:48:12Z
dc.date.issued	2010-07
dc.date.conference	2010-07-13
dc.identifier.uri	https://oskar-bordeaux.fr/handle/20.500.12278/198162
dc.description.abstractEn	Although hardware has changed dramatically in the last few years, nodes of multicore chips augmented by Graphics Processing Units (GPUs) appear to be a major trend. Previous approaches for scheduling dense linear algebra operations on such a complex node achieved high performance, but at a double cost: they did not use the full potential of all the cores, and they produced static, non-generic code. In this extended abstract, we present a new approach for scheduling dense linear algebra operations on multicore architectures with GPU accelerators, using a dynamic scheduler capable of exploiting the full potential of the node [1]. We underline the benefits in terms of both programmability and performance. We illustrate our approach with a Cholesky factorization relying on cutting-edge GPU and CPU kernels [2], [3], achieving roughly 900 Gflop/s on an eight-core node accelerated with three NVIDIA Tesla GPUs.
dc.language.iso	en
dc.title.en	Dynamically scheduled Cholesky factorization on multicore architectures with GPU accelerators.
dc.type	Conference paper
dc.subject.hal	Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
bordeaux.hal.laboratories	Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800
bordeaux.institution	Université de Bordeaux
bordeaux.institution	Bordeaux INP
bordeaux.institution	CNRS
bordeaux.conference.title	Symposium on Application Accelerators in High Performance Computing (SAAHPC)
bordeaux.country	US
bordeaux.conference.city	Knoxville
bordeaux.peerReviewed	yes
hal.identifier	inria-00547616
hal.version	1
hal.invited	no
hal.proceedings	yes
hal.conference.end	2010-07-15
hal.popular	no
hal.audience	International
hal.origin.link	https://hal.archives-ouvertes.fr//inria-00547616v1
bordeaux.COinS	ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.date=2010-07&rft.au=AGULLO,%20Emmanuel&AUGONNET,%20C%C3%A9dric&DONGARRA,%20Jack&LTAIEF,%20Hatem&NAMYST,%20Raymond&rft.genre=unknown

This item appears in the following Collection(s)

Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800
