
hal.structure.identifier: Biodiversité, Gènes & Communautés [BioGeCo]
hal.structure.identifier: Pleiade, from patterns to models in computational biodiversity and biotechnology [PLEIADE]
dc.contributor.author: ABOUABDALLAH, Mohamed
hal.structure.identifier: COmposabilité Numerique and parallèle pour le CAlcul haute performanCE [CONCACE]
dc.contributor.author: COULAUD, Olivier
hal.structure.identifier: Unité de Mathématiques et Informatique Appliquées de Toulouse [MIAT INRAE]
dc.contributor.author: PEYRARD, Nathalie
hal.structure.identifier: Biodiversité, Gènes & Communautés [BioGeCo]
hal.structure.identifier: Pleiade, from patterns to models in computational biodiversity and biotechnology [PLEIADE]
dc.contributor.author: FRANC, Alain
dc.date.accessioned: 2024-04-11T08:06:34Z
dc.date.available: 2024-04-11T08:06:34Z
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/197486
dc.description.abstractEn: The Weighted Stochastic Block Model (WSBM) is a statistical model for unsupervised clustering of individuals based on a pairwise distance matrix. The probabilities of group membership are computed as unary marginals of the joint conditional distribution of the WSBM, whose exact brute-force evaluation is out of reach beyond a few individuals. We propose to build an exact Tensor-Train (TT) decomposition of the multivariate joint distribution from the SVD of each binary factor of the WSBM, which leads to a separation of variables. We show how to exploit this decomposition to compute unary and binary marginals, which are expressed without approximation as products of the matrices involved in the TT decomposition. However, implementing the procedure raises several numerical challenges. First, the dimensions of the matrices involved grow faster than exponentially with the number of variables. We bypass this difficulty by using the TT-matrix format. Second, the TT-rank of the products grows exponentially. We therefore use rounding, a numerical approximation of the matrix product that guarantees a low TT-rank. We compare the TT approach with two classical inference methods, the Mean-Field approximation and the Gibbs sampler, on the problem of binary marginal inference for WSBMs with various distance structures and up to fifty variables. The results lead us to recommend the TT approach for its accuracy and reasonable computing time. Further research should address the numerical difficulties of controlling the rank in rounding, in order to handle larger problems.
dc.language.iso: en
dc.rights.uri: http://creativecommons.org/licenses/by/
dc.subject.en: Binary marginals
dc.subject.en: Weighted Stochastic Block Model
dc.subject.en: Variables separation
dc.subject.en: Tensor-Train format
dc.subject.en: low rank approximation
dc.subject.en: TT matrices
dc.title.en: Computing WSBM marginals with Tensor-Train decomposition
dc.type: Working paper - Preprint
dc.subject.hal: Statistics [stat]/Computation [stat.CO]
bordeaux.hal.laboratories: BioGeCo (Biodiversité Gènes & Communautés) - UMR 1202*
bordeaux.institution: Université de Bordeaux
bordeaux.institution: INRAE
hal.identifier: hal-04394024
hal.version: 1
hal.origin.link: https://hal.archives-ouvertes.fr//hal-04394024v1
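
The abstract in this record rests on two ideas: each binary factor of the WSBM can be separated into a sum of products of single-variable terms by an SVD, and the quantities of interest are marginals of a joint distribution that is a product of such factors. The following Python sketch illustrates both on a toy problem. It is not the authors' implementation: it does not build the TT decomposition or perform rounding, and the Gaussian form of the factor, the parameter values, and all function names are assumptions made for illustration.

```python
import itertools
import numpy as np


def pairwise_factor(d, mu, sigma):
    """Gaussian density of a distance d (assumed emission law, not from the paper).

    With mu a K x K array of block means, this returns the K x K table
    psi[k, l] for one pair of individuals; with scalar mu it returns a scalar.
    """
    return np.exp(-0.5 * ((d - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


def separate_factor(psi):
    """Exact separation psi[k, l] = sum_r left[k, r] * right[r, l] via SVD."""
    u, s, vt = np.linalg.svd(psi)
    return u * s, vt  # u * s scales column r of u by singular value s[r]


def brute_force_unary_marginals(dist, prior, mu, sigma):
    """Unary marginals by enumerating all K**n labelings (small n only)."""
    n, K = dist.shape[0], prior.size
    marg = np.zeros((n, K))
    for z in itertools.product(range(K), repeat=n):
        # Unnormalized weight of labeling z: prior terms times pairwise factors.
        w = np.prod(prior[list(z)])
        for i in range(n):
            for j in range(i + 1, n):
                w *= pairwise_factor(dist[i, j], mu[z[i], z[j]], sigma)
        for i in range(n):
            marg[i, z[i]] += w
    # Every row sums to the same normalizing constant.
    return marg / marg.sum(axis=1, keepdims=True)


# Toy data (hypothetical): 4 individuals, 2 groups.
dist = np.array([
    [0.0, 1.1, 2.9, 3.2],
    [1.1, 0.0, 3.1, 2.8],
    [2.9, 3.1, 0.0, 2.1],
    [3.2, 2.8, 2.1, 0.0],
])
prior = np.array([0.6, 0.4])
mu = np.array([[1.0, 3.0], [3.0, 2.0]])
sigma = 0.5

psi = pairwise_factor(dist[0, 1], mu, sigma)   # one binary factor
left, right = separate_factor(psi)             # its separated form
marg = brute_force_unary_marginals(dist, prior, mu, sigma)
print(marg)
```

The enumeration costs K**n evaluations, which is the brute-force barrier the abstract mentions; it is useful here only as a reference against which an approximate method could be checked on very small instances.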


Files in this item

There are no files associated with this item.