hal.structure.identifier: Université de Bordeaux [UB]
hal.structure.identifier: Institut de Mathématiques de Bordeaux [IMB]
hal.structure.identifier: Quality control and dynamic reliability [CQFD]
dc.contributor.author: CHAVENT, Marie
hal.structure.identifier: Retired
dc.contributor.author: CHAVENT, Guy
dc.date.accessioned: 2024-04-04T02:47:31Z
dc.date.available: 2024-04-04T02:47:31Z
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/191651
dc.description.abstractEn: We address the problem of defining a group-sparse formulation of Principal Component Analysis (PCA), or of its equivalent formulations as low-rank approximation or dictionary learning problems, that achieves a compromise between maximizing the variance explained by the components and promoting sparsity of the loadings. We first propose a new definition of the variance explained by not necessarily orthogonal components, which is optimal in some respect and compatible with the principal components situation. We then regularize this variance with the group-L1 norm to define a Group Sparse Maximum Variance (GSMV) formulation of PCA. The GSMV formulation achieves our objective by construction, and has the useful property that the inner nonsmooth optimization problem can be solved analytically, thus reducing GSMV to the maximization of a smooth and convex function under unit norm and orthogonality constraints, which generalizes Journee et al. (2010) to group sparsity. Numerical comparison with deflation on synthetic data shows that GSMV consistently produces slightly better and more robust results for the retrieval of hidden sparse structures, and is about three times faster on these examples. Application to real data shows the value of group sparsity for variable selection in PCA of mixed (categorical/numerical) data.
dc.language.iso: en
dc.subject.en: PCA
dc.subject.en: Sparsity
dc.subject.en: Dimension reduction
dc.subject.en: Variance
dc.subject.en: Mixed data
dc.subject.en: Orthogonal constraints
dc.subject.en: Block optimization
dc.title.en: Optimal Projected Variance Group-Sparse Block PCA
dc.type: Working paper - Preprint
dc.subject.hal: Statistics [stat]/Other [stat.ML]
dc.identifier.arxiv: 1705.00461
bordeaux.hal.laboratories: Institut de Mathématiques de Bordeaux (IMB) - UMR 5251*
bordeaux.institution: Université de Bordeaux
bordeaux.institution: Bordeaux INP
bordeaux.institution: CNRS
hal.identifier: hal-03125264
hal.version: 1
hal.origin.link: https://hal.archives-ouvertes.fr//hal-03125264v1