CHAVENT, Marie; LACAILLE, Jérôme; MOURER, Alex; OLTEANU, Madalina

hal.structure.identifier	Université de Bordeaux [UB]
hal.structure.identifier	Méthodes avancées d’apprentissage statistique et de contrôle [ASTRAL]
dc.contributor.author	CHAVENT, Marie
hal.structure.identifier	Safran Aircraft Engines
dc.contributor.author	LACAILLE, Jérôme
hal.structure.identifier	Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) [SAMM]
dc.contributor.author	MOURER, Alex
hal.structure.identifier	CEntre de REcherches en MAthématiques de la DEcision [CEREMADE]
dc.contributor.author	OLTEANU, Madalina
dc.date.accessioned	2024-04-04T02:39:47Z
dc.date.available	2024-04-04T02:39:47Z
dc.date.conference	2022-06-13
dc.identifier.uri	https://oskar-bordeaux.fr/handle/20.500.12278/191002
dc.description.abstractEn	High-dimensional data often contain both numerical and categorical features, and in some cases the features come in natural groups (repeated measurements, categories of features, ...). Clustering this kind of data raises several issues: how to deal with numerical and categorical features simultaneously, how to build meaningful clusters of the input entities, and how to select the most informative features or groups of features for the clustering. In the k-means framework, one may rely on a penalised version of the between-cluster variance to find both the best partitioning of the data and the most informative features or groups of features. The present manuscript illustrates sparse k-means and group-sparse k-means for mixed data, using the vimpclust package. The example provided on a small real-life dataset shows how feature selection may be combined directly with clustering to provide a meaningful selection while preserving the quality of the clustering.
dc.language.iso	en
dc.subject	clustering
dc.subject	k-means parcimonieux
dc.subject	pénalités L1 et L1-groupe
dc.subject	données mixtes
dc.subject	packages R
dc.subject.en	clustering
dc.subject.en	sparse k-means
dc.subject.en	L1 and group-L1 penalties
dc.subject.en	mixed data
dc.subject.en	R packages
dc.title.en	Sparse and group-sparse clustering for mixed data: An illustration of the vimpclust package
dc.type	Communication dans un congrès
dc.subject.hal	Mathématiques [math]/Statistiques [math.ST]
bordeaux.hal.laboratories	Institut de Mathématiques de Bordeaux (IMB) - UMR 5251	*
bordeaux.institution	Université de Bordeaux
bordeaux.institution	Bordeaux INP
bordeaux.institution	CNRS
bordeaux.conference.title	JDS 2022 - 53èmes Journées de Statistique de la Société Française de Statistique (SFdS)
bordeaux.country	FR
bordeaux.conference.city	Lyon
bordeaux.peerReviewed	oui
hal.identifier	hal-03839521
hal.version	1
hal.invited	non
hal.proceedings	non
hal.conference.end	2022-06-17
hal.popular	non
hal.audience	Internationale
hal.origin.link	https://hal.archives-ouvertes.fr//hal-03839521v1

Files in this item

Files	Size	Format	View
No files are associated with this item.

This item appears in the following collection(s)

Institut de Mathématiques de Bordeaux (IMB) - UMR 5251
