Handling Correlations in Random Forests: which Impacts on Variable Importance and Model Interpretability?
MOURER, Alex
Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) [SAMM]
Safran Aircraft Engines
Language
en
Conference paper
This item was published in
ESANN 2021 - European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2021-10-06, Bruges.
English Abstract
This manuscript addresses model interpretability and variable importance in random forests when input variables are correlated. Variable importance criteria based on random permutations are known to be sensitive to correlation among input variables, which may, for instance, make the importance ranking unreliable. To overcome some of the problems raised by correlation, an original variable importance measure is introduced. The proposed measure builds upon an algorithm that clusters the input variables based on their correlations and summarises each cluster by a synthetic variable. The effectiveness of the proposed criterion is illustrated through simulations in a regression context and compared with several existing variable importance measures.
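The abstract does not specify the clustering algorithm or how the synthetic variable is built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a greedy grouping on absolute Pearson correlation with a hypothetical threshold of 0.7, and it assumes the synthetic variable is the sign-aligned average of the standardised cluster members. All function names and the toy data are illustrative. A random forest importance measure would then be computed on the synthetic variables instead of the raw inputs; that step is omitted here.

```python
import math
import random
from statistics import fmean


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def cluster_by_correlation(columns, threshold=0.7):
    """Greedy grouping (an assumption, not the paper's algorithm): a variable
    joins the first cluster whose first member it correlates with, in absolute
    value, at or above `threshold`; otherwise it starts a new cluster."""
    clusters = []
    for name, values in columns.items():
        for cluster in clusters:
            if abs(pearson(values, columns[cluster[0]])) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def synthetic_variable(columns, cluster):
    """Summarise a cluster by averaging its standardised members, flipping
    the sign of members that are negatively correlated with the first one."""
    ref = columns[cluster[0]]
    total = [0.0] * len(ref)
    for name in cluster:
        values = columns[name]
        mu = fmean(values)
        sd = math.sqrt(fmean([(v - mu) ** 2 for v in values]))
        sign = 1.0 if pearson(values, ref) >= 0 else -1.0
        for i, v in enumerate(values):
            total[i] += sign * (v - mu) / sd
    return [t / len(cluster) for t in total]


# Toy data: x1, x2 and x3 share a latent factor z; x4 is independent noise.
rng = random.Random(0)
n = 200
z = [rng.gauss(0, 1) for _ in range(n)]
columns = {
    "x1": [v + 0.1 * rng.gauss(0, 1) for v in z],
    "x2": [v + 0.1 * rng.gauss(0, 1) for v in z],
    "x3": [-v + 0.1 * rng.gauss(0, 1) for v in z],
    "x4": [rng.gauss(0, 1) for _ in range(n)],
}

clusters = cluster_by_correlation(columns)
synthetic = {"+".join(c): synthetic_variable(columns, c) for c in clusters}
print(clusters)
```

Grouping correlated inputs before measuring importance means a permutation acts on one summary per cluster rather than on a single member whose information is still carried by its correlated neighbours, which is the sensitivity the abstract describes.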
Origin
Hal imported