Data-Driven Sparse Partial Least Squares
hal.structure.identifier | Méthodes avancées d’apprentissage statistique et de contrôle [ASTRAL] | |
dc.contributor.author | LORENZO, Hadrien | |
hal.structure.identifier | Sartorius Stedim France S.A.S. [Aubagne] | |
dc.contributor.author | CLOAREC, Olivier | |
hal.structure.identifier | Statistics In System biology and Translational Medicine [SISTM] | |
hal.structure.identifier | Bordeaux population health [BPH] | |
hal.structure.identifier | Vaccine Research Institute [Créteil, France] [VRI] | |
dc.contributor.author | THIÉBAUT, Rodolphe | |
hal.structure.identifier | Méthodes avancées d’apprentissage statistique et de contrôle [ASTRAL] | |
dc.contributor.author | SARACCO, Jérôme | |
dc.date.accessioned | 2024-04-04T02:45:19Z | |
dc.date.available | 2024-04-04T02:45:19Z | |
dc.date.issued | 2022 | |
dc.identifier.issn | 1932-1864 | |
dc.identifier.uri | https://oskar-bordeaux.fr/handle/20.500.12278/191456 | |
dc.description.abstractEn | In the supervised high-dimensional setting, with a large number of variables and a small number of individuals, variable selection allows simpler interpretation and more reliable predictions. This subspace selection is often managed with supervised tools when the real question is motivated by variable prediction. We propose a Partial Least Squares (PLS) based method, called data-driven sparse PLS (ddsPLS), that allows variable selection in both the covariate and the response parts using a single hyper-parameter per component. The subspace estimation is also performed by tuning a number of underlying parameters. The ddsPLS method is compared with existing methods, such as classical PLS and two well-established sparse PLS methods, through numerical simulations. The observed results are promising in terms of both variable selection and prediction performance. This methodology is based on new prediction quality descriptors associated with the classical R² and Q², and uses bootstrap sampling to tune parameters and select an optimal regression model. | |
dc.language.iso | en | |
dc.publisher | Wiley | |
dc.subject.en | PLS regression | |
dc.subject.en | Supervised learning | |
dc.subject.en | Variable selection | |
dc.subject.en | Soft thresholding | |
dc.subject.en | Multi-block data | |
dc.title.en | Data-Driven Sparse Partial Least Squares | |
dc.type | Journal article | |
dc.identifier.doi | 10.1002/sam.11558 | |
dc.subject.hal | Mathematics [math]/Statistics [math.ST] | |
bordeaux.journal | Statistical Analysis and Data Mining | |
bordeaux.page | 264-282 | |
bordeaux.volume | 15 | |
bordeaux.hal.laboratories | Institut de Mathématiques de Bordeaux (IMB) - UMR 5251 | * |
bordeaux.issue | 2 | |
bordeaux.institution | Université de Bordeaux | |
bordeaux.institution | Bordeaux INP | |
bordeaux.institution | CNRS | |
bordeaux.peerReviewed | yes | |
hal.identifier | hal-03368956 | |
hal.version | 1 | |
hal.popular | no | |
hal.audience | International | |
hal.origin.link | https://hal.archives-ouvertes.fr//hal-03368956v1 | |
bordeaux.COinS | ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Statistical%20Analysis%20and%20Data%20Mining&rft.date=2022&rft.volume=15&rft.issue=2&rft.spage=264-282&rft.epage=264-282&rft.eissn=1932-1864&rft.issn=1932-1864&rft.au=LORENZO,%20Hadrien&CLOAREC,%20Olivier&THI%C3%89BAUT,%20Rodolphe&SARACCO,%20J%C3%A9r%C3%B4me&rft.genre=article |