HAJJAR, Ghina; BENABEN, David; PAULHE, Nils; DUPERIER, Christophe; FILANGI, Olivier; GIACOMONI, Franck; COMTE, Blandine; PUJOS-GUILLOT, Estelle

hal.structure.identifier	Plateforme Exploration du Métabolisme [PFEM]
dc.contributor.author	HAJJAR, Ghina
hal.structure.identifier	Biologie du fruit et pathologie [BFP]
hal.structure.identifier	Plateforme Bordeaux Metabolome
dc.contributor.author	BENABEN, David
hal.structure.identifier	Plateforme Exploration du Métabolisme [PFEM]
dc.contributor.author	PAULHE, Nils
hal.structure.identifier	Plateforme Exploration du Métabolisme [PFEM]
dc.contributor.author	DUPERIER, Christophe
hal.structure.identifier	Institut de Génétique, Environnement et Protection des Plantes [IGEPP]
dc.contributor.author	FILANGI, Olivier
hal.structure.identifier	Plateforme Exploration du Métabolisme [PFEM]
dc.contributor.author	GIACOMONI, Franck
hal.structure.identifier	Plateforme Exploration du Métabolisme [PFEM]
dc.contributor.author	COMTE, Blandine
hal.structure.identifier	Plateforme Exploration du Métabolisme [PFEM]
dc.contributor.author	PUJOS-GUILLOT, Estelle
dc.date.conference	2022-09-05
dc.description.abstractEn	Since the emergence of high-throughput metabolomics, a growing number of scientific communities have been performing metabolomic studies. It has therefore become crucial to standardize how metabolites are reported and shared. Although minimum reporting standards for analytical practices and data processing are available, there are no established standards for metabolite reporting. Our objective was to review existing metabolite reporting practices in different scientific communities, both in published results and across databases. We considered plasma metabolites reported in large-scale human studies from different communities, namely analytical chemistry, medicine and epidemiology. We focused only on metabolites reported as level 1 identifications according to the Metabolomics Standards Initiative. We applied a data curation workflow to the list of annotated metabolites given by the authors. First, we performed a manual curation that included adding missing identifiers and editing some incoherent metadata. Second, we applied an automatic query algorithm to obtain additional information from available databases, such as the InChIKey, the compact hash code of the IUPAC International Chemical Identifier. Identified metabolites were then compared between the selected studies using either the names given by the authors or the InChIKeys added after data curation. Regular inconsistencies were observed in metabolite reporting, both in published results and across different databases. In published results, the metabolite information was incoherent (identifiers not referring to the same isomer, metabolite name not corresponding to the molecular formula). In addition, isomers were listed with their corresponding retention times, yet without any indication of the isomers' identity.
Across databases, the cross-links contained incoherent information regarding nomenclature, optical isomerism, stereochemistry of asymmetric carbons, and molecular structure (acid/base, zwitterionic or canonical forms, molecules with a permanent charge), in addition to mismatches between two structurally different compounds. The evaluation of metabolite reporting across different databases, for instance HMDB, PubChem and ChEBI, was performed with the help of the Metabolomics Semantic DataLake (MSD) team. Information was computed from the latest public versions of these databases, on a Big Data infrastructure (Apache Spark) using the Scala programming language. Based on the InChIKey, we were able to identify all incorrect metabolite matches in HMDB, PubChem and ChEBI and to categorize them as “structurally different compounds”, “optical isomerism” or “structural isomerism”. Although not yet required, the InChIKey was found to be the most suitable identifier for comparing reported metabolites between studies and across databases. It is therefore recommended either to use this identifier or to perform a deep data curation when reporting identified metabolites. This work will help provide guidelines for more effective and reproducible metabolomics data sharing.
dc.language.iso	en
dc.title.en	Metabolite reporting in large-scale studies within different metabolomics communities: DO WE SPEAK THE SAME LANGUAGE?
dc.type	Other scientific communication (conference without proceedings - poster - seminar...)
dc.subject.hal	Computer Science [cs]/Bioinformatics [q-bio.QM]
dc.subject.hal	Chemistry/Analytical chemistry
dc.subject.hal	Computer Science [cs]/Databases [cs.DB]
dc.subject.hal	Life Sciences [q-bio]/Public health and epidemiology
dc.subject.hal	Life Sciences [q-bio]/Food and Nutrition
bordeaux.conference.title	Analytics 2022
bordeaux.country	FR
bordeaux.conference.city	Nantes
bordeaux.peerReviewed	yes
hal.identifier	hal-03775474
hal.version	1
hal.invited	no
hal.proceedings	no
hal.conference.end	2022-09-08
hal.popular	no
hal.audience	International
hal.origin.link	https://hal.archives-ouvertes.fr//hal-03775474v1
bordeaux.COinS	ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.au=HAJJAR,%20Ghina&BENABEN,%20David&PAULHE,%20Nils&DUPERIER,%20Christophe&FILANGI,%20Olivier&rft.genre=conference
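
The InChIKey-based categorization described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' workflow (which the abstract says ran on Apache Spark with Scala); the names `Metabolite`, `split_inchikey` and `classify_match` are hypothetical, and the extra "same compound" and "protonation state" labels are assumptions added for completeness. The sketch relies only on the standard InChIKey layout: a 14-character connectivity block, a 10-character block covering stereochemistry and isotopes plus flag characters, and a one-character protonation indicator. Formulas are compared as plain strings, so they are assumed to be written in a consistent notation, and the example keys should be checked against the databases before reuse.

```python
from typing import NamedTuple


class Metabolite(NamedTuple):
    name: str
    formula: str   # molecular formula as reported (assumed consistently written)
    inchikey: str  # standard InChIKey


def split_inchikey(key: str) -> tuple[str, str, str]:
    """Split an InChIKey into connectivity, stereo/isotope and protonation parts."""
    parts = key.strip().upper().split("-")
    if [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return parts[0], parts[1], parts[2]


def classify_match(a: Metabolite, b: Metabolite) -> str:
    """Label a cross-reference between two records (illustrative categories)."""
    skel_a, stereo_a, prot_a = split_inchikey(a.inchikey)
    skel_b, stereo_b, prot_b = split_inchikey(b.inchikey)
    if skel_a == skel_b:
        if stereo_a != stereo_b:
            # The second block also encodes isotopes, so this label is an
            # approximation of "same skeleton, different stereo layer".
            return "optical isomerism"
        if prot_a != prot_b:
            return "protonation state"
        return "same compound"
    # Different connectivity: isomers only if the formulas agree.
    if a.formula == b.formula:
        return "structural isomerism"
    return "structurally different compounds"


# Example records; verify the keys against HMDB, PubChem or ChEBI before reuse.
L_ALA = Metabolite("L-alanine", "C3H7NO2", "QNAYBMKLOCPYGJ-REOHZAPYSA-N")
D_ALA = Metabolite("D-alanine", "C3H7NO2", "QNAYBMKLOCPYGJ-UWTATZPHSA-N")
B_ALA = Metabolite("beta-alanine", "C3H7NO2", "UCMIRNVEIXFBKS-UHFFFAOYSA-N")
GLY = Metabolite("glycine", "C2H5NO2", "DHMQDGOQFOQNFH-UHFFFAOYSA-N")

for other in (L_ALA, D_ALA, B_ALA, GLY):
    print(f"{L_ALA.name} vs {other.name}: {classify_match(L_ALA, other)}")
```

Comparing on the full key rather than on author-given names avoids treating synonyms as different compounds, while comparing on the first block alone would merge stereoisomers; which granularity is appropriate depends on the question being asked.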

Files in this document

Files	Size	Format	View
There are no files associated with this document.

This document appears in the following collection(s)

Biologie du Fruit & Pathologie (BFP) - UMR 1332
