
dc.rights.license | open | en_US
dc.contributor.author | NAHAL, Yasmine
dc.contributor.author | MENKE, Janosch
hal.structure.identifier | Statistics In System biology and Translational Medicine [SISTM]
hal.structure.identifier | Bordeaux population health [BPH]
dc.contributor.author | MARTINELLI, Julien
dc.contributor.author | HEINONEN, Markus
dc.contributor.author | KABESHOV, Mikhail
dc.contributor.author | JANET, Jon Paul
dc.contributor.author | NITTINGER, Eva
dc.contributor.author | ENGKVIST, Ola
dc.contributor.author | KASKI, Samuel
dc.date.accessioned | 2025-02-12T12:54:02Z
dc.date.available | 2025-02-12T12:54:02Z
dc.date.issued | 2024-12-09
dc.identifier.issn | 1758-2946 | en_US
dc.identifier.uri | https://oskar-bordeaux.fr/handle/20.500.12278/204818
dc.description.abstractEn | Machine learning (ML) systems have enabled the modelling of quantitative structure-property relationships (QSPR) and structure-activity relationships (QSAR) using existing experimental data to predict target properties for new molecules. These property predictors hold significant potential in accelerating drug discovery by guiding generative artificial intelligence (AI) agents to explore desired chemical spaces. However, they often struggle to generalize due to the limited scope of the training data. When optimized by generative agents, this limitation can result in the generation of molecules with artificially high predicted probabilities of satisfying target properties, which subsequently fail experimental validation. To address this challenge, we propose an adaptive approach that integrates active learning (AL) and iterative feedback to refine property predictors, thereby improving the outcomes of their optimization by generative AI agents. Our method leverages the Expected Predictive Information Gain (EPIG) criterion to select additional molecules for evaluation by an oracle. This process aims to provide the greatest reduction in predictive uncertainty, enabling more accurate model evaluations of subsequently generated molecules. Recognizing the impracticality of immediate wet-lab or physics-based experiments due to time and logistical constraints, we propose leveraging human experts for their cost-effectiveness and domain knowledge to effectively augment property predictors, bridging gaps in the limited training data. Empirical evaluations through both simulated and real human-in-the-loop experiments demonstrate that our approach refines property predictors to better align with oracle assessments. Additionally, we observe improved accuracy of predicted properties as well as improved drug-likeness among the top-ranking generated molecules.
SCIENTIFIC CONTRIBUTION: We present an adaptable framework that integrates AL and human expertise to refine property predictors for goal-oriented molecule generation. This approach is robust to noise in human feedback and ensures that navigating chemical space with human-refined predictors leverages human insights to identify molecules that not only satisfy predicted property profiles but also score highly on oracle models. Additionally, it prioritizes practical characteristics such as drug-likeness, synthetic accessibility, and a favorable balance between exploring diverse chemical space and exploiting similarity to existing training data.
dc.language.iso | EN | en_US
dc.rights | Attribution 3.0 United States | *
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/us/ | *
dc.subject.en | Active learning
dc.subject.en | Goal-oriented molecule generation
dc.subject.en | Human-in-the-loop
dc.subject.en | Interactive algorithms
dc.subject.en | Machine learning
dc.title.en | Human-in-the-loop active learning for goal-oriented molecule generation
dc.title.alternative | J Cheminform | en_US
dc.type | Journal article | en_US
dc.identifier.doi | 10.1186/s13321-024-00924-y | en_US
dc.subject.hal | Life Sciences [q-bio]/Public health and epidemiology | en_US
dc.identifier.pubmed | 39654043 | en_US
bordeaux.journal | Journal of Cheminformatics | en_US
bordeaux.page | 138 | en_US
bordeaux.volume | 16 | en_US
bordeaux.hal.laboratories | Bordeaux Population Health Research Center (BPH) - UMR 1219 | en_US
bordeaux.issue | 1 | en_US
bordeaux.institution | Université de Bordeaux | en_US
bordeaux.institution | INSERM | en_US
bordeaux.institution | INRIA | en_US
bordeaux.team | SISTM_BPH | en_US
bordeaux.peerReviewed | yes | en_US
bordeaux.inpress | no | en_US
bordeaux.identifier.funderID | Horizon 2020 | en_US
bordeaux.identifier.funderID | Knut och Alice Wallenbergs Stiftelse | en_US
hal.identifier | hal-04942779
hal.version | 1
hal.date.transferred | 2025-02-12T12:54:06Z
hal.popular | no | en_US
hal.audience | International | en_US
hal.export | true
dc.rights.cc | No CC license | en_US
bordeaux.COinS | ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Journal%20of%20Cheminformatics&rft.date=2024-12-09&rft.volume=16&rft.issue=1&rft.spage=138&rft.epage=138&rft.eissn=1758-2946&rft.issn=1758-2946&rft.au=NAHAL,%20Yasmine&MENKE,%20Janosch&MARTINELLI,%20Julien&HEINONEN,%20Markus&KABESHOV,%20Mikhail&rft.genre=article
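
The abstract describes selecting molecules for oracle evaluation with the Expected Predictive Information Gain (EPIG) criterion. The following is a minimal sketch of how such a selection step could look, not the authors' implementation. It assumes a classification-type property predictor whose posterior is represented by K Monte Carlo samples, and it scores each candidate by the estimated mutual information between its label and the labels of molecules drawn from a target distribution. The function names `epig_scores` and `select_batch`, the array layout, and the `eps` smoothing constant are all hypothetical choices made for illustration.

```python
import numpy as np


def epig_scores(pool_probs, target_probs, eps=1e-12):
    """Estimate EPIG for each candidate molecule (illustrative sketch).

    pool_probs:   array (K, N, C), class probabilities for N candidates
                  under K posterior samples of the predictor (assumed input).
    target_probs: array (K, M, C), class probabilities for M molecules
                  sampled from the target distribution (assumed input).
    Returns an array of shape (N,), one score per candidate.
    """
    n_samples = pool_probs.shape[0]

    # Joint predictive p(y, y* | x, x*), averaged over posterior samples.
    # Shape: (N, M, C, C).
    joint = np.einsum("knc,kmd->nmcd", pool_probs, target_probs) / n_samples

    # Marginal predictives p(y | x) and p(y* | x*).
    marg_pool = pool_probs.mean(axis=0)      # (N, C)
    marg_target = target_probs.mean(axis=0)  # (M, C)

    # Product of marginals, broadcast to the joint's shape.
    indep = marg_pool[:, None, :, None] * marg_target[None, :, None, :]

    # Mutual information = KL(joint || product of marginals) per (x, x*) pair.
    mi = np.sum(joint * (np.log(joint + eps) - np.log(indep + eps)), axis=(2, 3))

    # EPIG is the expectation of that mutual information over target inputs.
    return mi.mean(axis=1)


def select_batch(pool_probs, target_probs, batch_size):
    """Return indices of the batch_size candidates with the highest EPIG."""
    scores = epig_scores(pool_probs, target_probs)
    return np.argsort(-scores)[:batch_size]


if __name__ == "__main__":
    # Synthetic stand-in for predictor outputs: 8 posterior samples,
    # 50 candidates, 20 target molecules, binary property.
    rng = np.random.default_rng(0)
    pool = rng.dirichlet([1.0, 1.0], size=(8, 50))
    target = rng.dirichlet([1.0, 1.0], size=(8, 20))
    print(select_batch(pool, target, batch_size=5))
```

In a human-in-the-loop setting of the kind the abstract outlines, the selected indices would identify the molecules shown to the expert, whose feedback is then used to update the predictor before the next round of generation.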

