CHAVENT, Marie; GENUER, Robin; SARACCO, Jerome

doi:10.1080/03610918.2018.1563145

CHAVENT, Marie
Quality control and dynamic reliability [CQFD]
Institut de Mathématiques de Bordeaux [IMB]

GENUER, Robin
Statistics In System biology and Translational Medicine [SISTM]

SARACCO, Jerome
Quality control and dynamic reliability [CQFD]
Institut de Mathématiques de Bordeaux [IMB]
Ecole Nationale Supérieure de Cognitique [ENSC]

Journal article

This item is published in

Communications in Statistics - Simulation and Computation. 2021-01-11, vol. 50, n° 2, p. 426-445

Taylor & Francis

Abstract in English

Standard approaches to high-dimensional supervised classification often include variable selection and dimension reduction. The proposed methodology combines clustering of variables and feature selection. Hierarchical clustering of variables makes it possible to build groups of correlated variables and to summarize each group by a synthetic variable. The originality of the approach is that the groups of variables are not known a priori. Moreover, the clustering approach handles both numerical and categorical variables. Among all the possible partitions, the most relevant synthetic variables are selected with a procedure using random forests. Numerical performance is illustrated on simulated and real datasets. Selecting groups of variables makes the results easier to interpret.

Keywords in English

Clustering of variables

Random forests

Supervised classification

Variable selection

Combining clustering of variables and feature selection using random forests
