LOPEZ, Uriel; TRUJILLO, Leonardo; LEGRAND, Pierrick

doi:10.1007/978-3-319-99253-2_17


hal.structure.identifier	Instituto Tecnológico de Tijuana = Tijuana Institute of Technology [Tijuana]
dc.contributor.author	LOPEZ, Uriel
hal.structure.identifier	Instituto Tecnológico de Tijuana = Tijuana Institute of Technology [Tijuana]
dc.contributor.author	TRUJILLO, Leonardo
hal.structure.identifier	Université de Bordeaux [UB]
hal.structure.identifier	Quality control and dynamic reliability [CQFD]
hal.structure.identifier	Institut de Mathématiques de Bordeaux [IMB]
dc.contributor.author	LEGRAND, Pierrick
dc.contributor.editor	Anne Auger
dc.contributor.editor	Carlos M. Fonseca
dc.contributor.editor	Nuno Lourenço
dc.contributor.editor	Penousal Machado
dc.contributor.editor	Luís Paquete
dc.contributor.editor	Darrell Whitley
dc.date.accessioned	2024-04-04T03:04:42Z
dc.date.available	2024-04-04T03:04:42Z
dc.date.conference	2018-09-08
dc.identifier.uri	https://oskar-bordeaux.fr/handle/20.500.12278/193178
dc.description.abstractEn	Outliers are one of the most difficult issues when dealing with real-world modeling tasks. Even a small percentage of outliers can impede a learning algorithm's ability to fit a dataset. While robust regression algorithms exist, they fail when more than 50% of a dataset consists of outliers (the breakdown point). In the case of Genetic Programming (GP), robust regression has not been properly studied. In this paper we present a method that works as a filter, removing outliers from the target variable (vertical outliers). The algorithm is simple: it uses a randomly generated population of GP trees to determine which target values should be labeled as outliers. The method is highly efficient. Results show that it can return a clean dataset when contamination reaches as high as 90%, and it may be able to handle higher levels of contamination. In this study only synthetic univariate benchmarks are used to evaluate the approach, but it must be stressed that no other approaches can deal with such high levels of outlier contamination while requiring such small computational effort.
dc.language.iso	en
dc.publisher	Springer
dc.title.en	Filtering Outliers in One Step with Genetic Programming
dc.type	Conference paper
dc.identifier.doi	10.1007/978-3-319-99253-2_17
dc.subject.hal	Computer Science [cs]/Artificial Intelligence [cs.AI]
dc.subject.hal	Statistics [stat]
dc.subject.hal	Statistics [stat]/Machine Learning [stat.ML]
bordeaux.volume	LNCS - Lecture Notes in Computer Science
bordeaux.hal.laboratories	Institut de Mathématiques de Bordeaux (IMB) - UMR 5251
bordeaux.issue	11102
bordeaux.institution	Université de Bordeaux
bordeaux.institution	Bordeaux INP
bordeaux.institution	CNRS
bordeaux.conference.title	PPSN 2018 - Fifteenth International Conference on Parallel Problem Solving from Nature (PPSN XV)
bordeaux.country	PT
bordeaux.conference.city	Coimbra
bordeaux.peerReviewed	yes
hal.identifier	hal-01910433
hal.version	1
hal.invited	no
hal.proceedings	yes
hal.conference.end	2018-09-12
hal.popular	no
hal.audience	International
hal.origin.link	https://hal.archives-ouvertes.fr//hal-01910433v1

File(s) constituting this document

There are no files associated with this document.

This document appears in the following collection(s)

Institut de Mathématiques de Bordeaux (IMB) - UMR 5251
