Sparse k-means for mixed data via group-sparse clustering
MOURER, Alex
Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) [SAMM]
Safran Aircraft Engines
Quality control and dynamic reliability [CQFD]
Language
en
Conference paper
This document was published in
ESANN 2020 - 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2020-10-02, Bruges / Virtual. ISBN 978-2-87587-074-2
Abstract
This manuscript tackles variable selection for clustering in high-dimensional data described by both numerical and categorical features. First, we build upon the sparse k-means algorithm with lasso penalty and introduce the group-L1 penalty, already known in regression, in the unsupervised context. Second, we preprocess mixed data and transform categorical features into groups of dummy variables with appropriate scaling, on which one may then apply the group-sparse clustering procedure. The proposed method simultaneously performs clustering and feature selection, and provides meaningful partitions together with the numerical and categorical features that describe them.
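The two steps named in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. The function names `dummy_group` and `group_soft_threshold` are hypothetical. The dummy scaling (centering, then dividing by the square root of the category frequency) is an assumption, since the abstract only says "appropriate scaling". The weight update uses a common group-lasso form of group soft-thresholding, which may differ from the exact update in the paper.

```python
import math
from collections import Counter


def dummy_group(values):
    """Turn one categorical feature into a group of scaled dummy columns.

    Assumption: each indicator column is centered and divided by the square
    root of its category frequency. The source does not specify the scaling.
    """
    n = len(values)
    freq = {level: count / n for level, count in Counter(values).items()}
    levels = sorted(freq)
    columns = [
        [((1.0 if v == level else 0.0) - freq[level]) / math.sqrt(freq[level])
         for v in values]
        for level in levels
    ]
    return levels, columns


def group_soft_threshold(scores, groups, lam):
    """Group soft-thresholding of per-column scores, then L2 normalization.

    scores: non-negative per-column criterion (for instance a between-cluster
            sum of squares computed by a weighted k-means step)
    groups: list of lists of column indices, one list per original feature
    lam:    threshold (hypothetical tuning parameter); larger values zero
            out more groups
    """
    weights = [0.0] * len(scores)
    for idx in groups:
        norm = math.sqrt(sum(scores[j] ** 2 for j in idx))
        # Shrink the whole group at once; sqrt(group size) penalizes
        # features that expand into many dummy columns.
        shrink = max(0.0, 1.0 - lam * math.sqrt(len(idx)) / norm) if norm > 0 else 0.0
        for j in idx:
            weights[j] = shrink * scores[j]
    total = math.sqrt(sum(w * w for w in weights))
    return [w / total for w in weights] if total > 0 else weights


# One categorical feature with three levels becomes a group of three columns.
levels, cols = dummy_group(["a", "b", "a", "c"])

# Column 0 is a numerical feature; columns 1 to 3 form one categorical group.
# With these made-up scores the weak categorical group is dropped as a whole.
w = group_soft_threshold([5.0, 0.3, 0.2, 0.1], [[0], [1, 2, 3]], lam=1.0)
```

Because the threshold acts on the norm of a whole group, a categorical feature is either kept or discarded with all of its dummy columns, which is what makes the selected features interpretable at the level of the original variables.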
Keywords
Clustering
Kmeans algorithm
Variables selection
Sparse Models
Lasso penalty
Group lasso
Interpretability
Explainability
Weighted Kmeans
Origin
Imported from HAL
Research units