hal.structure.identifier: Instituto Tecnológico de Tijuana = Tijuana Institute of Technology [Tijuana]
dc.contributor.author: LOPEZ, Uriel
hal.structure.identifier: Instituto Tecnológico de Tijuana = Tijuana Institute of Technology [Tijuana]
dc.contributor.author: TRUJILLO, Leonardo
hal.structure.identifier: Université de Bordeaux [UB]
hal.structure.identifier: Quality control and dynamic reliability [CQFD]
hal.structure.identifier: Institut de Mathématiques de Bordeaux [IMB]
dc.contributor.author: LEGRAND, Pierrick
dc.contributor.editor: Anne Auger
dc.contributor.editor: Carlos M. Fonseca
dc.contributor.editor: Nuno Lourenço
dc.contributor.editor: Penousal Machado
dc.contributor.editor: Luís Paquete
dc.contributor.editor: Darrell Whitley
dc.date.accessioned: 2024-04-04T03:04:42Z
dc.date.available: 2024-04-04T03:04:42Z
dc.date.conference: 2018-09-08
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/193178
dc.description.abstractEn: Outliers are one of the most difficult issues when dealing with real-world modeling tasks. Even a small percentage of outliers can impede a learning algorithm’s ability to fit a dataset. While robust regression algorithms exist, they fail when outliers make up more than 50% of a dataset (the breakdown point). In the case of Genetic Programming (GP), robust regression has not been properly studied. In this paper we present a method that works as a filter, removing outliers from the target variable (vertical outliers). The algorithm is simple: it uses a randomly generated population of GP trees to determine which target values should be labeled as outliers. The method is highly efficient. Results show that it can return a clean dataset when contamination reaches as high as 90%, and it may be able to handle higher levels of contamination. In this study only synthetic univariate benchmarks are used to evaluate the approach, but it must be stressed that no other approaches can deal with such high levels of outlier contamination while requiring such small computational effort.
dc.language.iso: en
dc.publisher: Springer
dc.title.en: Filtering Outliers in One Step with Genetic Programming
dc.type: Conference paper
dc.identifier.doi: 10.1007/978-3-319-99253-2_17
dc.subject.hal: Computer Science [cs]/Artificial Intelligence [cs.AI]
dc.subject.hal: Statistics [stat]
dc.subject.hal: Statistics [stat]/Machine Learning [stat.ML]
bordeaux.volume: LNCS - Lecture Notes in Computer Science
bordeaux.hal.laboratories: Institut de Mathématiques de Bordeaux (IMB) - UMR 5251*
bordeaux.issue: 11102
bordeaux.institution: Université de Bordeaux
bordeaux.institution: Bordeaux INP
bordeaux.institution: CNRS
bordeaux.conference.title: PPSN 2018 - Fifteenth International Conference on Parallel Problem Solving from Nature (PPSN XV)
bordeaux.country: PT
bordeaux.conference.city: Coimbra
bordeaux.peerReviewed: yes
hal.identifier: hal-01910433
hal.version: 1
hal.invited: no
hal.proceedings: yes
hal.conference.end: 2018-09-12
hal.popular: no
hal.audience: International
hal.origin.link: https://hal.archives-ouvertes.fr//hal-01910433v1
bordeaux.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.volume=LNCS%20-%20Lecture%20Notes%20in%20Computer%20Science&rft.issue=11102&rft.au=LOPEZ,%20Uriel&TRUJILLO,%20Leonardo&LEGRAND,%20Pierrick&rft.genre=unknown

