Explainable Artificial Neural Network for Recurrent Venous Thromboembolism Based on Plasma Proteomics
Language
EN
Conference paper with proceedings
This document was published in
Computational Methods in Systems Biology, International Conference on Computational Methods in Systems Biology, 2021-09-22, Bordeaux. 2021, p. 108-121
English abstract
Venous thromboembolism (VTE) is the third most common cardiovascular disease, affecting ∼ 1,000,000 individuals each year in Europe. VTE is characterized by an annual recurrence rate of ∼ 6%, and ∼ 30% of patients with unprovoked VTE will face a recurrent event after a six-month course of anticoagulant treatment. Although guidelines recommend lifelong treatment for these patients, ∼ 70% of them will never experience a recurrence and will therefore receive unnecessary lifelong anticoagulation, which is associated with an increased risk of bleeding and is highly costly for society. There is thus an urgent need to identify biomarkers that can distinguish VTE patients at high risk of recurrence from low-risk patients. Capitalizing on a sample of 913 patients followed up for the risk of VTE recurrence over a median of ∼ 10 years and profiled with 376 plasma proteomic antibodies, we develop here an artificial neural network (ANN)-based strategy to identify a proteomic signature that helps discriminate between patients at low and high risk of recurrence. In a first stage, we implemented a Repeated Editing Nearest Neighbors algorithm to select a homogeneous sub-sample of VTE patients. This sub-sample was then split into a training set and a testing set. The former was used to train our ANN, the latter to test its discriminatory properties. In the testing dataset, our ANN achieved an accuracy of 0.86, compared with an accuracy of 0.79 for a random forest classifier. We then applied a Deep Learning Important FeaTures (DeepLIFT)-based approach to identify the variables that contribute the most to the ANN predictions. In addition to sex, the proposed DeepLIFT strategy identified 6 important proteins (DDX1, HTRA3, LRG1, MAST2, NFATC4 and STXBP5) whose exact roles in the etiology of VTE recurrence now deserve further experimental validation. © 2021, Springer Nature Switzerland AG.
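The first stage described in the abstract, selecting a homogeneous sub-sample with a Repeated Editing Nearest Neighbors algorithm, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract gives no parameters, so the function names, the Euclidean distance, the value `k=3`, the `max_iter` cap and the toy data are all assumptions made for illustration. The sketch also edits every class, whereas some implementations edit only the majority class.

```python
from collections import Counter
import math


def k_nearest_labels(index, points, labels, k):
    """Return the labels of the k points closest to points[index] (Euclidean distance)."""
    distances = [
        (math.dist(points[index], other), j)
        for j, other in enumerate(points)
        if j != index
    ]
    distances.sort()  # ties in distance are broken by the lower index
    return [labels[j] for _, j in distances[:k]]


def edited_nearest_neighbours(points, labels, k=3):
    """One editing pass: keep a sample only if its label agrees with the
    majority label among its k nearest neighbours. Returns kept positions."""
    keep = []
    for i in range(len(points)):
        neighbour_labels = k_nearest_labels(i, points, labels, k)
        if not neighbour_labels:
            keep.append(i)  # a lone sample has no neighbours to disagree with
            continue
        majority_label = Counter(neighbour_labels).most_common(1)[0][0]
        if majority_label == labels[i]:
            keep.append(i)
    return keep


def repeated_enn(points, labels, k=3, max_iter=100):
    """Repeat the editing pass until no sample is removed (or max_iter passes).
    Returns the indices, in the original data, of the retained samples."""
    kept = list(range(len(points)))
    for _ in range(max_iter):
        current_points = [points[i] for i in kept]
        current_labels = [labels[i] for i in kept]
        local_keep = edited_nearest_neighbours(current_points, current_labels, k)
        if len(local_keep) == len(kept):
            break  # stable: every remaining sample agrees with its neighbourhood
        kept = [kept[i] for i in local_keep]
    return kept


# Hypothetical toy data: two clusters, each containing one sample whose
# label disagrees with its surroundings (indices 4 and 9).
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
              (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
toy_kept = repeated_enn(toy_points, toy_labels, k=3)
```

On this toy data the two samples whose labels conflict with their neighbourhood are dropped and the rest are retained. In a real analysis each point would be a patient's vector of proteomic measurements rather than a 2-D coordinate.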
English keywords
Artificial neural network
Interpretation
Thrombosis
Proteomics
Imbalanced
Research units