
hal.structure.identifier: COmposabilité Numerique and parallèle pour le CAlcul haute performanCE [CONCACE]
dc.contributor.author: AGULLO, Emmanuel
hal.structure.identifier: Algorithmes Parallèles et Optimisation [IRIT-APO]
hal.structure.identifier: Centre National de la Recherche Scientifique [CNRS]
dc.contributor.author: BUTTARI, Alfredo
hal.structure.identifier: COmposabilité Numerique and parallèle pour le CAlcul haute performanCE [CONCACE]
dc.contributor.author: COULAUD, Olivier
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
hal.structure.identifier: Outils et Optimisations pour le Calcul Haute Performance et l'Apprentissage [TOPAL]
dc.contributor.author: EYRAUD-DUBOIS, Lionel
hal.structure.identifier: Outils et Optimisations pour le Calcul Haute Performance et l'Apprentissage [TOPAL]
dc.contributor.author: FAVERGE, Mathieu
hal.structure.identifier: Biodiversité, Gènes & Communautés [BioGeCo]
hal.structure.identifier: Pleiade, from patterns to models in computational biodiversity and biotechnology [PLEIADE]
dc.contributor.author: FRANC, Alain
hal.structure.identifier: Outils et Optimisations pour le Calcul Haute Performance et l'Apprentissage [TOPAL]
hal.structure.identifier: Laboratoire Bordelais de Recherche en Informatique [LaBRI]
dc.contributor.author: GUERMOUCHE, Abdou
hal.structure.identifier: Algorithmes Parallèles et Optimisation [IRIT-APO]
hal.structure.identifier: Institut National Polytechnique (Toulouse) [Toulouse INP]
dc.contributor.author: JEGO, Antoine
hal.structure.identifier: COmposabilité Numerique and parallèle pour le CAlcul haute performanCE [CONCACE]
dc.contributor.author: PERESSONI, Romain
hal.structure.identifier: Service Expérimentation et Développement [Bordeaux] [SED]
dc.contributor.author: PRUVOST, Florent
dc.contributor.editor: IEEE
dc.date.created: 2023
dc.date.issued: 2023-06
dc.date.conference: 2023-05-15
dc.description.abstractEn: Dense matrix multiplication involving a symmetric input matrix (SYMM) is implemented in reference distributed-memory codes with the same data distribution as its general analogue (GEMM). We show that, when the symmetric matrix is dominant, such a 2D block-cyclic (2D BC) scheme leads to a lower arithmetic intensity (AI) of SYMM than that of GEMM by a factor of 2. We propose alternative data distributions preserving the memory benefit of SYMM of storing only half of the matrix while achieving up to the same AI as GEMM. We also show that, when we can afford the same memory footprint as GEMM, SYMM can achieve a higher AI. We propose a task-based design of SYMM independent of the data distribution. This design allows for scalable A-stationary SYMM with which all discussed data distributions, however irregular, can be easily assessed. We have integrated the resulting code in a dimension reduction algorithm involving a randomized singular value decomposition dominated by SYMM. An experimental study shows a compelling impact on performance.
dc.description.sponsorship: Solveurs pour architectures hétérogènes utilisant des supports d'exécution, objectif scalabilité - ANR-19-CE46-0009
dc.language.iso: en
dc.rights.uri: http://creativecommons.org/licenses/by/
dc.source.title: International Parallel and Distributed Processing Symposium
dc.subject.en: Matrix multiplication
dc.subject.en: SYMM
dc.subject.en: GEMM
dc.subject.en: 2DBC
dc.subject.en: task-based programming
dc.subject.en: Symmetric
dc.subject.en: SBC
dc.subject.en: TBC
dc.subject.en: 3D
dc.subject.en: 2.5D
dc.title.en: On the Arithmetic Intensity of Distributed-Memory Dense Matrix Multiplication Involving a Symmetric Input Matrix (SYMM)
dc.type: Conference paper
dc.subject.hal: Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
dc.subject.hal: Computer Science [cs]/Bioinformatics [q-bio.QM]
bordeaux.page: 357-367
bordeaux.conference.title: IPDPS 2023 - 37th International Parallel and Distributed Processing Symposium
bordeaux.country: US
bordeaux.title.proceeding: International Parallel and Distributed Processing Symposium
bordeaux.conference.city: St. Petersburg, FL
bordeaux.peerReviewed: yes
hal.identifier: hal-04093162
hal.version: 1
hal.invited: no
hal.proceedings: yes
hal.conference.organizer: IEEE
hal.conference.end: 2023-05-19
hal.popular: no
hal.audience: International
hal.origin.link: https://hal.archives-ouvertes.fr//hal-04093162v1

