Sparse k-means for mixed data via group-sparse clustering
MOURER, Alex
Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) [SAMM]
Safran Aircraft Engines
Quality control and dynamic reliability [CQFD]
Language
en
Conference paper
This item was published in
ESANN 2020 - 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2020-10-02, Bruges / Virtual. ISBN 978-2-87587-074-2
English Abstract
This manuscript addresses variable selection for clustering high-dimensional data described by both numerical and categorical features. First, we build upon the sparse k-means algorithm with lasso penalty and introduce the group-L1 penalty, already known in regression, into the unsupervised context. Second, we preprocess mixed data by transforming categorical features into groups of appropriately scaled dummy variables, to which the group-sparse clustering procedure can then be applied. The proposed method performs clustering and feature selection simultaneously, yielding meaningful partitions together with the numerical and categorical features that describe them.
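The two steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation. The function names (`encode_mixed`, `between_cluster_ss`, `group_soft_threshold`), the dummy scaling by the square root of the number of levels, and the fixed threshold `delta` are all assumptions made for illustration; the abstract does not specify them. The sketch covers only the weight update for a given partition, using the usual group soft-thresholding form, and omits the alternating k-means step.

```python
import numpy as np

def encode_mixed(numeric, categorical):
    """Standardize numeric columns and dummy-encode categorical columns.

    Both inputs are 2-D arrays (rows = observations). Returns the design
    matrix X and, for each column of X, the index of its original feature.
    """
    blocks, groups, g = [], [], 0
    for col in np.asarray(numeric, dtype=float).T:
        sd = col.std()
        blocks.append(((col - col.mean()) / (sd if sd > 0 else 1.0))[:, None])
        groups.append(g)
        g += 1
    for col in np.asarray(categorical).T:
        levels = np.unique(col)
        dummies = (col[:, None] == levels[None, :]).astype(float)
        dummies -= dummies.mean(axis=0)
        # Assumed scaling (not taken from the paper): divide by sqrt(#levels).
        blocks.append(dummies / np.sqrt(len(levels)))
        groups.extend([g] * len(levels))
        g += 1
    return np.hstack(blocks), np.array(groups)

def between_cluster_ss(X, labels):
    """Per-column between-cluster sum of squares (total minus within)."""
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for k in np.unique(labels):
        Xk = X[labels == k]
        within += ((Xk - Xk.mean(axis=0)) ** 2).sum(axis=0)
    return total - within

def group_soft_threshold(a, groups, delta):
    """Shrink each group of a by delta * sqrt(group size), then L2-normalize."""
    w = np.zeros_like(a, dtype=float)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(a[idx])
        shrink = max(0.0, 1.0 - delta * np.sqrt(idx.sum()) / norm) if norm > 0 else 0.0
        w[idx] = shrink * a[idx]
    total = np.linalg.norm(w)
    return w / total if total > 0 else w

# Toy data: one informative numeric feature, one uninformative categorical one.
numeric = np.array([[0.0], [0.1], [0.2], [0.1], [5.0], [5.1], [5.2], [5.1]])
categorical = np.array([["a"], ["b"], ["a"], ["b"], ["a"], ["b"], ["a"], ["b"]])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # a given partition

X, groups = encode_mixed(numeric, categorical)
w = group_soft_threshold(between_cluster_ss(X, labels), groups, delta=0.5)
```

Because the penalty acts on whole groups, all dummy columns of a categorical feature are kept or discarded together, so selection operates on original features and not on individual levels.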
English Keywords
Clustering
K-means algorithm
Variable selection
Sparse Models
Lasso penalty
Group lasso
Interpretability
Explainability
Weighted k-means
Origin
Hal imported