
dc.rights.license: open (en_US)
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: HIVERT, Benjamin
dc.contributor.author: AGNIEL, Denis
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: THIEBAUT, Rodolphe
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: HEJBLUM, Boris
ORCID: 0000-0003-0646-452X
IDREF: 189970316
dc.date.accessioned: 2023-03-07T15:04:56Z
dc.date.available: 2023-03-07T15:04:56Z
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/172203
dc.description.abstractEn: Clustering is an unsupervised analysis method that groups samples into homogeneous, well-separated subgroups of observations called clusters. To interpret the clusters, statistical hypothesis testing is often used to infer which variables significantly separate the estimated clusters from each other. However, the hypotheses tested are data-driven, since they are derived from the clustering results. This double use of the data causes traditional hypothesis tests to fail to control the Type I error rate, particularly because of uncertainty in the clustering process and the artificial differences it can create. We propose three novel statistical hypothesis tests that account for the clustering process. Our tests efficiently control the Type I error rate by identifying only variables that contain a true signal separating groups of observations.
dc.language.iso: EN (en_US)
dc.subject.en: Clustering
dc.subject.en: Hypothesis testing
dc.subject.en: Double-dipping
dc.subject.en: Circular analysis
dc.subject.en: Selective inference
dc.subject.en: Multimodality test
dc.subject.en: Dip Test
dc.title.en: Post-clustering difference testing: valid inference and practical considerations
dc.type: Working paper - Preprint (en_US)
dc.subject.hal: Life Sciences [q-bio]/Public health and epidemiology (en_US)
bordeaux.hal.laboratories: Bordeaux Population Health Research Center (BPH) - UMR 1219 (en_US)
bordeaux.institution: Université de Bordeaux (en_US)
bordeaux.institution: INSERM (en_US)
bordeaux.team: SISTM_BPH (en_US)
hal.export: false
dc.rights.cc: No CC license (en_US)
bordeaux.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.au=HIVERT,%20Benjamin&AGNIEL,%20Denis&THIEBAUT,%20Rodolphe&HEJBLUM,%20Boris&rft.genre=preprint
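
The double use of the data described in the abstract can be illustrated with a small simulation. The sketch below is not one of the three tests proposed by the authors; it only shows the problem they address, under simplifying assumptions: one Gaussian variable with no true group structure, a plain 1-D 2-means clustering, and a two-sample z-test using a normal approximation. All function names (`two_means_1d`, `z_test_pvalue`, `rejection_rates`) are hypothetical and chosen for this example.

```python
import random
from statistics import NormalDist, mean, variance


def two_means_1d(x, n_iter=50):
    """Plain 1-D 2-means (Lloyd's algorithm), initialised at min and max."""
    c0, c1 = min(x), max(x)
    for _ in range(n_iter):
        # Assign each point to the nearest of the two centres.
        g0 = [v for v in x if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in x if abs(v - c0) > abs(v - c1)]
        new0, new1 = mean(g0), mean(g1)
        if (new0, new1) == (c0, c1):  # assignments have stabilised
            break
        c0, c1 = new0, new1
    return g0, g1


def z_test_pvalue(g0, g1):
    """Two-sided two-sample z-test (normal approximation, unequal variances)."""
    if len(g0) < 2 or len(g1) < 2:
        return 1.0  # not enough points to estimate a variance
    se = (variance(g0) / len(g0) + variance(g1) / len(g1)) ** 0.5
    if se == 0:
        return 1.0
    z = (mean(g0) - mean(g1)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def rejection_rates(n_sim=200, n=100, alpha=0.05, seed=0):
    """Rejection rates under the null for two ways of forming groups.

    Returns (naive, control):
      naive   - groups come from clustering the same data that is tested
      control - groups come from a split that ignores the data values
    """
    rng = random.Random(seed)
    naive = control = 0
    for _ in range(n_sim):
        # Null data: a single Gaussian population, no real clusters.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        naive += z_test_pvalue(*two_means_1d(x)) < alpha
        half = n // 2
        control += z_test_pvalue(x[:half], x[half:]) < alpha
    return naive / n_sim, control / n_sim


if __name__ == "__main__":
    naive, control = rejection_rates()
    print(f"clustering then testing: {naive:.2f}")
    print(f"data-independent split:  {control:.2f}")
```

In this sketch the first rate should come out far above `alpha`, because the clustering step itself creates the difference in means that the test then detects, while the second should stay close to `alpha`.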

