BCL::contact-low confidence fold recognition hits boost protein contact prediction and de novo structure determination.

KarakaƟ M, Woetzel N, Meiler J
J Comput Biol. 2010 17 (2): 153-68

PMID: 19772383 · PMCID: PMC3148831 · DOI:10.1089/cmb.2009.0030

Knowledge of all residue-residue contacts within a protein allows determination of the protein fold. Accurate prediction of even a subset of long-range contacts (contacts between amino acids far apart in sequence) can be instrumental for determining tertiary structure. Here we present BCL::Contact, a novel contact prediction method that utilizes artificial neural networks (ANNs) and specializes in the prediction of medium to long-range contacts. BCL::Contact comes in two modes: sequence-based and structure-based. The sequence-based mode uses only sequence information and has individual ANNs specialized for helix-helix, helix-strand, strand-helix, strand-strand, and sheet-sheet contacts. The structure-based mode combines results from 32-fold recognition methods with sequence information to a consensus prediction. The two methods were presented in the 6(th) and 7(th) Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments. The present work focuses on elucidating the impact of fold recognition results onto contact prediction via a direct comparison of both methods on a joined benchmark set of proteins. The sequence-based mode predicted contacts with 42% accuracy (7% false positive rate), while the structure-based mode achieved 45% accuracy (2% false positive rate). Predictions by both modes of BCL::Contact were supplied as input to the protein tertiary structure prediction program Rosetta for a benchmark of 17 proteins with no close sequence homologs in the protein data bank (PDB). Rosetta created higher accuracy models, signified by an improvement of 1.3 A on average root mean square deviation (RMSD), when driven by the predicted contacts. Further, filtering Rosetta models by agreement with the predicted contacts enriches for native-like fold topologies.

MeSH Terms (9)

Algorithms Caspase 7 Computer Simulation Humans Models, Molecular Neural Networks (Computer) Protein Conformation Protein Folding Sequence Analysis, Protein

Connections (2)

This publication is referenced by other Labnodes entities: