DYRKA, Witold; NEBEL, Jean‐Christophe; KOTULSKA, Malgorzata

doi:10.1186/1748-7188-8-31

hal.structure.identifier	Institute of Biomedical Engineering and Instrumentation
hal.structure.identifier	Models and Algorithms for the Genome [MAGNOME]
dc.contributor.author	DYRKA, Witold
hal.structure.identifier	Faculty of Science
dc.contributor.author	NEBEL, Jean‐Christophe
hal.structure.identifier	Institute of Biomedical Engineering and Instrumentation
dc.contributor.author	KOTULSKA, Malgorzata
dc.date.accessioned	2024-04-15T09:42:02Z
dc.date.available	2024-04-15T09:42:02Z
dc.date.issued	2013
dc.identifier.issn	1748-7188
dc.identifier.uri	https://oskar-bordeaux.fr/handle/20.500.12278/197649
dc.description.abstractEn	Background: Hidden Markov Models power many state‐of‐the‐art tools in the field of protein bioinformatics. While excelling in their tasks, these methods of protein analysis do not directly convey information on medium‐ and long‐range residue‐residue interactions. This requires an expressive power of at least context‐free grammars. However, application of more powerful grammar formalisms to protein analysis has been surprisingly limited. Results: In this work, we present a probabilistic grammatical framework for problem‐specific protein languages and apply it to classification of transmembrane helix‐helix pair configurations. The core of the model consists of a probabilistic context‐free grammar, automatically inferred by a genetic algorithm from only a generic set of expert‐based rules and positive training samples. The model was applied to produce sequence‐based descriptors of four classes of transmembrane helix‐helix contact site configurations. The highest performance of the classifiers reached an AUC ROC of 0.70. The analysis of grammar parse trees revealed the ability of representing structural features of helix‐helix contact sites. Conclusions: We demonstrated that our probabilistic context‐free framework for analysis of protein sequences outperforms the state of the art in the task of helix‐helix contact site classification. However, this is achieved without necessarily requiring modeling of long‐range dependencies between interacting residues. A significant feature of our approach is that grammar rules and parse trees are human‐readable. Thus they could provide biologically meaningful information for molecular biologists.
dc.language.iso	en
dc.publisher	BioMed Central
dc.subject.en	Probabilistic context-free grammar
dc.subject.en	Grammar inference
dc.subject.en	Genetic algorithm
dc.subject.en	Helix-helix contact
dc.subject.en	Protein structure prediction
dc.title.en	Probabilistic grammatical model for helix‐helix contact site classification
dc.type	Journal article
dc.identifier.doi	10.1186/1748-7188-8-31
dc.subject.hal	Life Sciences [q-bio]/Biochemistry, Molecular Biology/Molecular biology
bordeaux.journal	Algorithms for Molecular Biology
bordeaux.page	31
bordeaux.volume	8
bordeaux.hal.laboratories	Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800
bordeaux.issue	1
bordeaux.institution	Université de Bordeaux
bordeaux.institution	Bordeaux INP
bordeaux.institution	CNRS
bordeaux.peerReviewed	yes
hal.identifier	hal-00925929
hal.version	1
hal.popular	no
hal.audience	International
hal.origin.link	https://hal.archives-ouvertes.fr//hal-00925929v1
bordeaux.COinS	ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Algorithms%20for%20Molecular%20Biology&rft.date=2013&rft.volume=8&rft.issue=1&rft.spage=31&rft.epage=31&rft.eissn=1748-7188&rft.issn=1748-7188&rft.au=DYRKA,%20Witold&NEBEL,%20Jean%E2%80%90christophe&KOTULSKA,%20Malgorzata&rft.genre=article

Files in this item

Files	Size	Format	View
There are no files associated with this item.

This item appears in the following Collection(s)

Laboratoire Bordelais de Recherche en Informatique (LaBRI) - UMR 5800
