De novo construction of a “Gene-space” for diploid plant genome rich in repetitive sequences by an iterative Process of Extraction and Assembly of NGS reads (iPEA protocol) with limited computing resources
hal.structure.identifier | Biodiversité, Gènes & Communautés [BioGeCo] | |
dc.contributor.author | ALUOME, Christelle | |
hal.structure.identifier | Agroécologie [Dijon] | |
dc.contributor.author | AUBERT, Gregoire | |
hal.structure.identifier | Agroécologie [Dijon] | |
dc.contributor.author | ALVES CARVALHO, Susete | |
hal.structure.identifier | Etude du Polymorphisme des Génomes Végétaux [EPGV] | |
dc.contributor.author | LE PASLIER, Marie-Christine | |
hal.structure.identifier | Agroécologie [Dijon] | |
dc.contributor.author | BURSTIN, Judith | |
hal.structure.identifier | Etude du Polymorphisme des Génomes Végétaux [EPGV] | |
dc.contributor.author | BRUNEL, Dominique | |
dc.date.issued | 2016 | |
dc.identifier.issn | 1756-0500 | |
dc.description.abstractEn | Background: The continuing increase in the volume and quality of raw “short read” data significantly improves the quality of assemblies obtained with various bioinformatics tools. However, building a reference genome sequence for most plant species remains a significant challenge because of the large number of repeated sequences, which hinder a high-quality whole-genome de novo assembly. Furthermore, most SNP identification approaches in plant genetics and breeding consider only the “Gene-space” regions, which include the promoter, exon and intron sequences. Results: We developed the iPEA protocol to produce a de novo Gene-space assembly by iteratively reconstructing the non-coding sequence flanking the Unigene cDNA sequence through the addition of next-generation DNA-seq data. The approach was developed on the large diploid genome of pea (Pisum sativum L.), which is rich in repetitive sequences. The final Gene-space assembly included 35,400 contigs (97 Mb), covering 88% of the 40,227 contigs (53.1 Mb) of the PsCam_low-copy Unigene set. Its accuracy was validated by the results of the GenoPea 13.2 K SNP Array built from it. Conclusion: The iPEA protocol allows the reconstruction of a Gene-space from RNA-Seq and DNA-seq data with limited computing resources. | |
dc.language.iso | en | |
dc.publisher | BioMed Central | |
dc.subject.en | gene-space | |
dc.subject.en | unigene | |
dc.subject.en | next-generation sequencing NGS | |
dc.subject.en | assembly | |
dc.subject.en | iterative process | |
dc.subject.en | limited computing resources | |
dc.title.en | De novo construction of a “Gene-space” for diploid plant genome rich in repetitive sequences by an iterative Process of Extraction and Assembly of NGS reads (iPEA protocol) with limited computing resources | |
dc.type | Journal article | |
dc.identifier.doi | 10.1186/s13104-016-1903-z | |
dc.subject.hal | Life Sciences [q-bio] | |
dc.subject.hal | Life Sciences [q-bio]/Plant biology | |
dc.subject.hal | Environmental Sciences | |
bordeaux.journal | BMC Research Notes | |
bordeaux.page | 1-9 | |
bordeaux.volume | 9 | |
bordeaux.issue | 1 | |
bordeaux.peerReviewed | yes | |
hal.identifier | hal-02636294 | |
hal.version | 1 | |
hal.popular | no | |
hal.audience | International | |
hal.origin.link | https://hal.archives-ouvertes.fr//hal-02636294v1 | |
bordeaux.COinS | ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=BMC%20Research%20Notes&rft.date=2016&rft.volume=9&rft.issue=1&rft.spage=1-9&rft.epage=1-9&rft.eissn=1756-0500&rft.issn=1756-0500&rft.au=ALUOME,%20Christelle&AUBERT,%20Gregoire&ALVES%20CARVALHO,%20Susete&LE%20PASLIER,%20Marie-Christine&BURSTIN,%20Judith&rft.genre=article |