dc.rights.license: open (en_US)
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: CAPITAINE, Louis
hal.structure.identifier: Institut de Mathématiques de Bordeaux [IMB]
dc.contributor.author: BIGOT, Jeremie
IDREF: 075404877
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: THIEBAUT, Rodolphe
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
hal.structure.identifier: Bordeaux population health [BPH]
dc.contributor.author: GENUER, Robin
dc.date.accessioned: 2021-05-07T12:42:41Z
dc.date.available: 2021-05-07T12:42:41Z
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/27208
dc.description.abstractEn: Random forests are a statistical learning method widely used in many areas of scientific research because of their ability to learn complex relationships between input and output variables and their capacity to handle high-dimensional data. However, current random forest approaches are not flexible enough to handle heterogeneous data such as curves, images and shapes. In this paper, we introduce Fréchet trees and Fréchet random forests, which make it possible to handle data for which input and output variables take values in general metric spaces (which can be unordered). To this end, a new way of splitting the nodes of trees is introduced and the prediction procedures of trees and forests are generalized. The out-of-bag error and variable importance score of random forests are then naturally adapted. A consistency theorem for the Fréchet regressogram predictor using data-driven partitions is given and applied to Fréchet purely uniformly random trees. The method is studied through several simulation scenarios on heterogeneous data combining longitudinal, image and scalar data. Finally, two real datasets from HIV vaccine trials are analyzed with the proposed method.
dc.language.iso: EN (en_US)
dc.title.en: Fréchet random forests for metric space valued regression with non Euclidean predictors
dc.type: Working paper - Preprint (en_US)
dc.subject.hal: Mathematics [math]/Statistics [math.ST] (en_US)
dc.identifier.arxiv: 1906.01741 (en_US)
bordeaux.hal.laboratories: Bordeaux Population Health Research Center (BPH) - U1219 (en_US)
bordeaux.institution: Université de Bordeaux (en_US)
bordeaux.institution: INSERM (en_US)
bordeaux.team: SISTM (en_US)
bordeaux.team: SISTM_BPH
bordeaux.import.source: hal
hal.identifier: hal-03066146
hal.version: 1
hal.export: false
workflow.import.source: hal
bordeaux.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.au=CAPITAINE,%20Louis&BIGOT,%20Jeremie&THIEBAUT,%20Rodolphe&GENUER,%20Robin&rft.genre=preprint


Files in this document

There are no files associated with this document.
