
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
dc.contributor.author: CAPITAINE, Louis
hal.structure.identifier: Institut de Mathématiques de Bordeaux [IMB]
dc.contributor.author: BIGOT, Jérémie
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
dc.contributor.author: THIÉBAUT, Rodolphe
hal.structure.identifier: Statistics In System biology and Translational Medicine [SISTM]
dc.contributor.author: GENUER, Robin
dc.date.accessioned: 2024-04-04T02:48:17Z
dc.date.available: 2024-04-04T02:48:17Z
dc.date.created: 2024
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/191725
dc.description.abstractEn: Random forests are a statistical learning method widely used in many areas of scientific research because of their ability to learn complex relationships between input and output variables and their capacity to handle high-dimensional data. However, current random forest approaches are not flexible enough to handle heterogeneous data such as curves, images and shapes. In this paper, we introduce Fréchet trees and Fréchet random forests, which can handle data for which input and output variables take values in general metric spaces. To this end, a new way of splitting the nodes of trees is introduced and the prediction procedures of trees and forests are generalized. The random forest out-of-bag error and variable importance score are then naturally adapted. A consistency theorem for the Fréchet regressogram predictor using data-driven partitions is given and applied to Fréchet purely uniformly random trees. The method is studied through several simulation scenarios on heterogeneous data combining longitudinal, image and scalar data. Finally, one real dataset about air quality is used to illustrate the use of the proposed method in practice.
dc.language.iso: en
dc.rights.uri: http://creativecommons.org/licenses/by/
dc.subject.en: Random forests
dc.subject.en: Nonparametric regression
dc.subject.en: Metric spaces regression
dc.subject.en: Longitudinal data
dc.subject.en: Heterogeneous data
dc.subject.en: Random objects
dc.title.en: Fréchet random forests for metric space valued regression with non Euclidean predictors
dc.type: Working paper - Preprint
dc.subject.hal: Mathematics [math]/Statistics [math.ST]
dc.identifier.arxiv: 1906.01741
bordeaux.hal.laboratories: Institut de Mathématiques de Bordeaux (IMB) - UMR 5251*
bordeaux.institution: Université de Bordeaux
bordeaux.institution: Bordeaux INP
bordeaux.institution: CNRS
hal.identifier: hal-03066146
hal.version: 1
hal.origin.link: https://hal.archives-ouvertes.fr//hal-03066146v1
bordeaux.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.au=CAPITAINE,%20Louis&BIGOT,%20J%C3%A9r%C3%A9mie&THI%C3%89BAUT,%20Rodolphe&GENUER,%20Robin&rft.genre=preprint


Files in this item

There are no files associated with this item.

This item appears in the following collection(s)
