ClustOfVar : an R package for dimension reduction via clustering of variables. Application in supervised classification and variable selection in gene expressions data
hal.structure.identifier | Quality control and dynamic reliability [CQFD] | |
dc.contributor.author | CHAVENT, Marie | |
hal.structure.identifier | Equipe de Biostatistique | |
dc.contributor.author | GENUER, Robin | |
hal.structure.identifier | Aménités et dynamiques des espaces ruraux [UR ADBX] | |
dc.contributor.author | KUENTZ-SIMONET, Vanessa | |
hal.structure.identifier | Equipe de Biostatistique | |
dc.contributor.author | LIQUET, Benoit | |
hal.structure.identifier | Quality control and dynamic reliability [CQFD] | |
dc.contributor.author | SARACCO, Jerôme | |
dc.date.accessioned | 2024-04-04T02:20:25Z | |
dc.date.available | 2024-04-04T02:20:25Z | |
dc.date.conference | 2013-01-24 | |
dc.identifier.uri | https://oskar-bordeaux.fr/handle/20.500.12278/189490 | |
dc.description.abstractEn | The main goal of this work is to tackle the problem of dimension reduction for high-dimensional supervised classification. The motivation is to handle gene expression data. The proposed method works in two steps. First, redundancy is eliminated by clustering the variables with the R package ClustOfVar. This first step uses only the explanatory variables (genes). Second, the synthetic variables (summarizing the clusters obtained in the first step) are used to construct a classifier (e.g. logistic regression, LDA, random forests). We stress that the first step reduces the dimension and yields linear combinations of the original variables (the synthetic variables). This step can be considered an alternative to PCA. Selecting predictors (synthetic variables) in the second step yields a set of relevant original variables (genes). The numerical performance of the proposed procedure is evaluated on gene expression datasets. We compare our methodology with LASSO and sparse PLS discriminant analysis on these datasets. | |
dc.language.iso | en | |
dc.title.en | ClustOfVar : an R package for dimension reduction via clustering of variables. Application in supervised classification and variable selection in gene expressions data | |
dc.type | Conference paper | |
dc.subject.hal | Statistics [stat]/Applications [stat.AP] | |
bordeaux.hal.laboratories | Institut de Mathématiques de Bordeaux (IMB) - UMR 5251 | * |
bordeaux.institution | Université de Bordeaux | |
bordeaux.institution | Bordeaux INP | |
bordeaux.institution | CNRS | |
bordeaux.conference.title | Statistical Methods for (post)-Genomics Data (SMPGD 2013) | |
bordeaux.country | NL | |
bordeaux.peerReviewed | yes | |
hal.identifier | hal-00926216 | |
hal.version | 1 | |
hal.invited | no | |
hal.proceedings | no | |
hal.popular | no | |
hal.audience | International | |
hal.origin.link | https://hal.archives-ouvertes.fr//hal-00926216v1 | |
bordeaux.COinS | ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.au=CHAVENT,%20Marie&GENUER,%20Robin&KUENTZ-SIMONET,%20Vanessa&LIQUET,%20Benoit&SARACCO,%20Jer%C3%B4me&rft.genre=unknown |
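
The two-step procedure described in the abstract can be sketched as follows. This is a minimal illustration in Python with NumPy, not the ClustOfVar implementation: it assumes quantitative variables only, uses a simple agglomerative rule (merge the two clusters whose union loses the least first-eigenvalue homogeneity), and substitutes a least-squares linear classifier for the classifiers named in the abstract. All function names and the simulated data are hypothetical.

```python
import numpy as np

def standardize(X):
    # Center and scale each column (population standard deviation).
    return (X - X.mean(axis=0)) / X.std(axis=0)

def top_eigenvalue(Z):
    # Largest eigenvalue of the correlation matrix of the standardized columns of Z.
    return np.linalg.eigvalsh(Z.T @ Z / Z.shape[0])[-1]

def first_pc(Z):
    # Scores of the first principal component: the cluster's synthetic variable.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, 0] * s[0]

def cluster_variables(X, k):
    # Step 1: agglomerative clustering of the columns of X into k clusters.
    Z = standardize(X)
    clusters = [[j] for j in range(Z.shape[1])]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Homogeneity lost by merging clusters a and b.
                loss = (top_eigenvalue(Z[:, clusters[a]])
                        + top_eigenvalue(Z[:, clusters[b]])
                        - top_eigenvalue(Z[:, clusters[a] + clusters[b]]))
                if best is None or loss < best[0]:
                    best = (loss, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    synth = np.column_stack([first_pc(Z[:, c]) for c in clusters])
    return clusters, synth

# Simulated data (an assumption for illustration): two groups of three
# redundant "genes", each driven by one latent factor; the class label
# depends only on the first factor.
rng = np.random.default_rng(0)
n = 200
f1, f2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [f1 + 0.1 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.1 * rng.normal(size=n) for _ in range(3)]
)
y = (f1 > 0).astype(int)

# Step 1: reduce six variables to two synthetic variables.
clusters, synth = cluster_variables(X, k=2)

# Step 2: fit a linear classifier on the synthetic variables
# (least squares on the 0/1 class indicator, thresholded at 0.5).
A = np.column_stack([np.ones(n), synth])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = (A @ w > 0.5).astype(int)
accuracy = float((pred == y).mean())

# Selecting a synthetic variable selects the original variables in its cluster.
selected = sorted(clusters[int(np.argmax(np.abs(w[1:])))])

print("clusters:", clusters)
print("training accuracy:", accuracy)
print("selected original variables:", selected)
```

In R, the ClustOfVar package itself provides the clustering step; the sketch above only mirrors the idea that each cluster is summarized by one synthetic variable and that choosing a synthetic variable in the second step points back to a group of original variables.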