Class-imbalanced subsampling lasso algorithm for discovering adverse drug reactions
Langue
EN
Article de revue
Ce document a été publié dans
Statistical Methods in Medical Research. 2018-03, vol. 27, n° 3, p. 785-797
Résumé en anglais
BACKGROUND: All methods routinely used to generate safety signals from pharmacovigilance databases rely on disproportionality analyses of counts aggregating patients' spontaneous reports. Recently, it was proposed to analyze ...Lire la suite >
BACKGROUND: All methods routinely used to generate safety signals from pharmacovigilance databases rely on disproportionality analyses of counts aggregating patients' spontaneous reports. Recently, it was proposed to analyze individual spontaneous reports directly using Bayesian lasso logistic regressions. Nevertheless, this raises the issue of choosing an adequate regularization parameter in a variable selection framework while accounting for computational constraints due to the high dimension of the data. PURPOSE: Our main objective is to propose a method, which exploits the subsampling idea from Stability Selection, a variable selection procedure combining subsampling with a high-dimensional selection algorithm, and adapts it to the specificities of the spontaneous reporting data, the latter being characterized by their large size, their binary nature and their sparsity. MATERIALS AND METHOD: Given the large imbalance existing between the presence and absence of a given adverse event, we propose an alternative subsampling scheme to that of Stability Selection resulting in an over-representation of the minority class and a drastic reduction in the number of observations in each subsample. Simulations are used to help define the detection threshold as regards the average proportion of false signals. They are also used to compare the performances of the proposed sampling scheme with that originally proposed for Stability Selection. Finally, we compare the proposed method to the gamma Poisson shrinker, a disproportionality method, and to a lasso logistic regression approach through an empirical study conducted on the French national pharmacovigilance database and two sets of reference signals. RESULTS: Simulations show that the proposed sampling strategy performs better in terms of false discoveries and is faster than the equiprobable sampling of Stability Selection. The empirical evaluation illustrates the better performances of the proposed method compared with gamma Poisson shrinker and the lasso in terms of number of reference signals retrieved.< Réduire
Mots clés en anglais
PharmacoEpi-Drugs
Unités de recherche