RdRp-scan: A bioinformatic resource to identify and annotate divergent RNA viruses in metagenomic sequence data
CHARON, Justine
Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement [INRAE]
Biologie du fruit et pathologie [BFP]
Language
en
Journal article
This document was published in
Virus Evolution. 2022-07-01, vol. 8, n° 2
Oxford University Press
Abstract (English)
Despite a rapid expansion in the number of documented viruses following the advent of metagenomic sequencing, the identification and annotation of highly divergent RNA viruses remain challenging, particularly from poorly characterized hosts and environmental samples. Protein structures are more conserved than primary sequence data, such that structure-based comparisons provide an opportunity to reveal the viral ‘dusk matter’: viral sequences with low, but detectable, levels of sequence identity to known viruses with available protein structures. Here, we present a new open computational resource, RdRp-scan, that contains a standardized bioinformatic toolkit to identify and annotate divergent RNA viruses in metagenomic sequence data based on the detection of RNA-dependent RNA polymerase (RdRp) sequences. By combining RdRp-specific hidden Markov models (HMMs) and structural comparisons, we show that RdRp-scan can efficiently detect RdRp sequences that have identity levels as low as 10 per cent to those from known viruses and that are not identifiable using standard sequence-to-sequence comparisons. In addition, to facilitate the annotation and placement of newly detected and divergent virus-like sequences into the diversity of RNA viruses, RdRp-scan provides new custom and curated databases of viral RdRp sequences and core motifs, as well as pre-built RdRp multiple sequence alignments. In parallel, our analysis of the sequence diversity detected by RdRp-scan revealed that while most of the taxonomically unassigned RdRps fell into pre-established clusters, some fell into potentially new orders of RNA viruses related to the Wolframvirales and Tolivirales. Finally, a survey of the conserved A, B, and C RdRp motifs within the RdRp-scan sequence database revealed additional variations of both sequence and position that might provide new insights into the structure, function, and evolution of viral polymerases.
Origin
Imported from HAL
Research units