Handling Correlations in Random Forests: which Impacts on Variable Importance and Model Interpretability?
MOURER, Alex
Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) [SAMM]
Safran Aircraft Engines
Language
en
Conference paper
This item is published in
ESANN 2021 - European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2021-10-06, Bruges.
Abstract in English
This manuscript addresses model interpretability and variable importance in random forests when the input variables are correlated. Variable importance criteria based on random permutations are known to be sensitive to correlation among input variables, and may for instance produce unreliable importance rankings. To overcome some of the problems raised by correlation, an original variable importance measure is introduced. The proposed measure builds upon an algorithm that clusters the input variables based on their correlations and summarises each cluster by a synthetic variable. The effectiveness of the proposed criterion is illustrated through simulations in a regression context and compared with several existing variable importance measures.
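The general idea described in the abstract (cluster correlated inputs, summarise each cluster by one synthetic variable, then measure importance at the cluster level) can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given here. Every specific choice below is an assumption made for illustration: hierarchical clustering on the distance 1 - |correlation|, the cut-off of 0.5, the mean of standardised columns as the synthetic variable, scikit-learn's permutation importance as the criterion, and the toy data itself. The helper names `cluster_by_correlation` and `synthetic_variables` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance


def cluster_by_correlation(X, threshold=0.5):
    """Assumed rule: average-linkage clustering on the distance 1 - |corr|."""
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    dist = (dist + dist.T) / 2.0  # guard against floating-point asymmetry
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=threshold, criterion="distance")


def synthetic_variables(X, labels):
    """Assumed summary: mean of standardised, sign-aligned cluster members."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    columns = []
    for k in np.unique(labels):
        block = Xs[:, labels == k]
        # Flip members that are negatively correlated with the first one.
        signs = np.where(block.T @ block[:, 0] >= 0, 1.0, -1.0)
        columns.append((block * signs).mean(axis=1))
    return np.column_stack(columns)


# Toy regression data: three noisy copies of a latent z, plus two independent inputs.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
X = np.column_stack(
    [z + 0.1 * rng.normal(size=n) for _ in range(3)]
    + [rng.normal(size=n), rng.normal(size=n)]
)
y = 2.0 * z + X[:, 3] + 0.1 * rng.normal(size=n)

labels = cluster_by_correlation(X)
S = synthetic_variables(X, labels)

# Fit on the first 350 rows, measure permutation importance on the held-out rest.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(S[:350], y[:350])
result = permutation_importance(
    forest, S[350:], y[350:], n_repeats=5, random_state=0
)
importances = result.importances_mean

for k, score in zip(np.unique(labels), importances):
    members = np.flatnonzero(labels == k).tolist()
    print(f"cluster {k} (columns {members}): importance {score:.3f}")
```

Because the three correlated columns are replaced by a single synthetic variable, their shared signal is permuted as one unit rather than being split across near-duplicates, which is the failure mode of per-variable permutation importance that the abstract points to.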
Source
Imported from HAL