Handling Correlations in Random Forests: which Impacts on Variable Importance and Model Interpretability?
MOURER, Alex
Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) [SAMM]
Safran Aircraft Engines
Language
en
Conference paper
This document was published in
ESANN 2021 - European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2021-10-06, Bruges.
English abstract
The present manuscript tackles the issues of model interpretability and variable importance in random forests, in the presence of correlated input variables. Variable importance criteria based on random permutations are known to be sensitive when input variables are correlated, and may lead for instance to unreliability in the importance ranking. In order to overcome some of the problems raised by correlation, an original variable importance measure is introduced. The proposed measure builds upon an algorithm which clusters the input variables based on their correlations, and summarises each such cluster by a synthetic variable. The effectiveness of the proposed criterion is illustrated through simulations in a regression context, and compared with several existing variable importance measures.
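The general idea in the abstract (cluster correlated inputs, summarise each cluster by a synthetic variable, then measure importance per cluster) can be illustrated with a short Python sketch. This is not the paper's algorithm: the abstract does not specify the clustering rule, how the synthetic variable is built, or the importance formula. The sketch assumes average-linkage hierarchical clustering on the distance 1 - |r|, a distance cutoff of 0.3, a synthetic variable equal to the mean of the standardised, sign-aligned cluster members, and scikit-learn's standard permutation importance. The function names `cluster_by_correlation` and `synthetic_variables` and the simulated data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def cluster_by_correlation(X, threshold=0.3):
    """Group columns of X by correlation (assumed rule: distance 1 - |r|)."""
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    dist = (dist + dist.T) / 2.0   # remove floating-point asymmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    # Columns whose cophenetic distance is <= threshold share a label.
    return fcluster(tree, t=threshold, criterion="distance")


def synthetic_variables(X, labels):
    """Summarise each cluster by the mean of its standardised members
    (an assumed construction, chosen for simplicity)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    columns = []
    for k in np.unique(labels):
        block = Xs[:, labels == k]
        # Flip members negatively correlated with the first one,
        # so that they do not cancel out in the mean.
        signs = np.sign(block.T @ block[:, 0])
        columns.append((block * signs).mean(axis=1))
    return np.column_stack(columns)


# Simulated regression data: columns 0 and 1 are noisy copies of the same
# latent signal z, column 2 is an independent predictor, column 3 is noise.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
X = np.column_stack([
    z + 0.1 * rng.normal(size=n),
    z + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
    rng.normal(size=n),
])
y = 2.0 * z + X[:, 2] + 0.1 * rng.normal(size=n)

labels = cluster_by_correlation(X)
S = synthetic_variables(X, labels)

# Fit the forest on the synthetic variables, then permute each one
# on held-out data to obtain one importance score per cluster.
S_train, S_test, y_train, y_test = train_test_split(S, y, random_state=0)
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(S_train, y_train)
result = permutation_importance(
    forest, S_test, y_test, n_repeats=10, random_state=0
)

for k, imp in zip(np.unique(labels), result.importances_mean):
    members = np.flatnonzero(labels == k).tolist()
    print(f"cluster {k} (columns {members}): importance {imp:.3f}")
```

Because the two correlated columns are merged into a single synthetic variable before the forest is fitted, the importance of the shared signal is reported once per cluster rather than being split between near-duplicate inputs. The cutoff of 0.3 is arbitrary and would need tuning on real data.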
Origin
Imported from HAL