hal.structure.identifierUniversité de Bordeaux [UB]
hal.structure.identifierMéthodes avancées d’apprentissage statistique et de contrôle [ASTRAL]
dc.contributor.authorCHAVENT, Marie
hal.structure.identifierSafran Aircraft Engines
dc.contributor.authorLACAILLE, Jérôme
hal.structure.identifierStatistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) [SAMM]
dc.contributor.authorMOURER, Alex
hal.structure.identifierCEntre de REcherches en MAthématiques de la DEcision [CEREMADE]
dc.contributor.authorOLTEANU, Madalina
dc.date.accessioned2024-04-04T02:39:47Z
dc.date.available2024-04-04T02:39:47Z
dc.date.conference2022-06-13
dc.identifier.urihttps://oskar-bordeaux.fr/handle/20.500.12278/191002
dc.description.abstractEnHigh-dimensional data often contain both numerical and categorical features, and in some cases the features come in natural groups (repeated measurements, categories of features, ...). Clustering this kind of data raises several issues: how to deal with numerical and categorical features simultaneously, how to build meaningful clusters of the input entities, and how to select the most informative features or groups of features for the clustering. In the k-means framework, one may rely on a penalised version of the between-cluster variance to find both the best partitioning of the data and the most informative features or groups of features. The present manuscript illustrates sparse k-means and group-sparse k-means for mixed data, using the vimpclust package. The example, provided on a small real-life dataset, shows how feature selection may be directly combined with clustering and can provide a meaningful selection while preserving the quality of the clustering.
dc.language.isoen
dc.subjectclustering
dc.subjectk-means parcimonieux
dc.subjectpénalités L1 et L1-groupe
dc.subjectdonnées mixtes
dc.subjectpackages R
dc.subject.enclustering
dc.subject.ensparse k-means
dc.subject.enL1 and group-L1 penalties
dc.subject.enmixed data
dc.subject.enR packages
dc.title.enSparse and group-sparse clustering for mixed data: An illustration of the vimpclust package
dc.typeConference paper
dc.subject.halMathématiques [math]/Statistiques [math.ST]
bordeaux.hal.laboratoriesInstitut de Mathématiques de Bordeaux (IMB) - UMR 5251*
bordeaux.institutionUniversité de Bordeaux
bordeaux.institutionBordeaux INP
bordeaux.institutionCNRS
bordeaux.conference.titleJDS 2022 - 53èmes Journées de Statistique de la Société Française de Statistique (SFdS)
bordeaux.countryFR
bordeaux.conference.cityLyon
bordeaux.peerReviewedyes
hal.identifierhal-03839521
hal.version1
hal.invitedno
hal.proceedingsno
hal.conference.end2022-06-17
hal.popularno
hal.audienceInternational
hal.origin.linkhttps://hal.archives-ouvertes.fr//hal-03839521v1