hal.structure.identifier | Analyse, ingénierie et contrôle des micro-organismes [MICROCOSME] | |
hal.structure.identifier | Laboratoire Interdisciplinaire de Physique [Saint Martin d’Hères] [LIPhy] | |
hal.structure.identifier | BIOP - Fluctuations, Régulation et Evolution des Systèmes Vivants [BIOP-LIPhy] | |
dc.contributor.author | BELCOUR, Arnaud | |
hal.structure.identifier | Dynamics, Logics and Inference for biological Systems and Sequences [Dyliss] | |
dc.contributor.author | HAMON-GIRAUD, Pauline | |
hal.structure.identifier | Dynamics, Logics and Inference for biological Systems and Sequences [Dyliss] | |
dc.contributor.author | MATAIGNE, Alice | |
hal.structure.identifier | Dynamics, Logics and Inference for biological Systems and Sequences [Dyliss] | |
dc.contributor.author | RUIZ, Baptiste | |
hal.structure.identifier | Dynamics, Logics and Inference for biological Systems and Sequences [Dyliss] | |
dc.contributor.author | LE CUNFF, Yann | |
hal.structure.identifier | Dynamics, Logics and Inference for biological Systems and Sequences [Dyliss] | |
dc.contributor.author | GOT, Jeanne | |
hal.structure.identifier | Optimisation des procédés en Agriculture, Agroalimentaire et Environnement [UR OPAALE] | |
dc.contributor.author | AWHANGBO, Lorraine | |
hal.structure.identifier | Optimisation des procédés en Agriculture, Agroalimentaire et Environnement [UR OPAALE] | |
dc.contributor.author | LEBRETON, Megane | |
hal.structure.identifier | Pleiade, from patterns to models in computational biodiversity and biotechnology [PLEIADE] | |
dc.contributor.author | FRIOUX, Clémence | |
hal.structure.identifier | Laboratoire de Biologie Intégrative des Modèles Marins [LBI2M] | |
dc.contributor.author | DITTAMI, Simon | |
hal.structure.identifier | Optimisation des procédés en Agriculture, Agroalimentaire et Environnement [UR OPAALE] | |
dc.contributor.author | DABERT, P. | |
hal.structure.identifier | Dynamics, Logics and Inference for biological Systems and Sequences [Dyliss] | |
dc.contributor.author | SIEGEL, Anne | |
hal.structure.identifier | Dynamics, Logics and Inference for biological Systems and Sequences [Dyliss] | |
dc.contributor.author | BLANQUART, Samuel | |
dc.date.created | 2025 | |
dc.description.abstractEn | Purpose: Metabarcoding and metagenomic sequencing have enabled the characterization of highly diverse environmental communities. The challenge of estimating the metabolic functions carried out by these communities has led to the development of several state-of-the-art methods, most of which are tailored to a specific gene marker. However, the increasing diversity of approaches resulting from advances in sequencing technologies drives the need for methods capable of handling heterogeneous microbial community data. Moreover, predictions often depend on the methods' internal analysis pipelines and are influenced by the underlying databases, which link marker genes to specific functional annotations. This limits users' ability to evaluate the quality of predictions by tracing internal data and processes. Finally, users are constrained by the specific annotations provided by these methods (e.g., EC numbers), limiting their ability to conduct further specialized analyses based on intermediate results. Methods: EsMeCaTa predicts consensus proteomes and their associated functions from taxonomic affiliations. Key features of EsMeCaTa are its explainability and flexibility. To support the flexible integration of heterogeneous sequencing data, EsMeCaTa uses taxonomic affiliations obtained through analyses of diverse sequencing datasets. To provide insight into the knowledge available for each taxonomic affiliation and to interpret the relevance of predicted functions, EsMeCaTa identifies a taxonomic rank within a given affiliation that is sufficiently represented by documented proteomes in the UniProt database. The proteins of the UniProt proteomes are clustered and filtered according to a threshold to create consensus proteomes. These consensus proteomes are automatically annotated with functional information (e.g., EC numbers, GO terms), but they are also designed to be used in further customized annotation workflows. Functional annotations are reported in a functional table, which can be enriched with taxon abundances to generate comprehensive functional profiles. Results: EsMeCaTa predictions have been validated using multiple datasets and compared to a state-of-the-art method. Additionally, EsMeCaTa was applied to a novel metabarcoding dataset from a methanogenic reactor, characterizing the microbial community and biogas production across different time points and intake conditions. Our results demonstrate the link between biogas production, intake conditions, and the dynamics of the metabolic functions predicted by EsMeCaTa in the microbial communities. | |
dc.description.sponsorship | Deciphering plant-microbiome interactions to enhance crop defense to bioagressors - ANR-20-PCPA-0004 | |
dc.description.sponsorship | Les origines microbiennes potentielles des propriétés biostimulantes des extraits d'un holobionte d'algues brunes - ANR-20-CE43-0013 | |
dc.language.iso | en | |
dc.rights.uri | http://creativecommons.org/licenses/by/ | |
dc.title.en | Estimating consensus proteomes and metabolic functions from taxonomic affiliations | |
dc.type | Working paper - Preprint | |
dc.identifier.doi | 10.1101/2022.03.16.484574 | |
dc.subject.hal | Informatique [cs]/Bio-informatique [q-bio.QM] | |
hal.identifier | hal-03697249 | |
hal.version | 1 | |
hal.origin.link | https://hal.archives-ouvertes.fr//hal-03697249v1 | |
bordeaux.COinS | ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.au=BELCOUR,%20Arnaud&HAMON-GIRAUD,%20Pauline&MATAIGNE,%20Alice&RUIZ,%20Baptiste&CUNFF,%20Yann%20Le&rft.genre=preprint | |