Recent developments in high-resolution mass spectrometry (MS) instruments have made chemical crosslinking (XL) a high-throughput method for obtaining structural information about proteins. Restraints derived from XL-MS experiments have been used successfully for structure refinement and protein-protein docking. However, one formidable question is under which circumstances XL-MS data might be sufficient to determine a protein's tertiary structure de novo. Answering this question requires not only understanding the impact of XL-MS data on sampling and scoring within a de novo protein structure prediction algorithm but also determining an optimal crosslinker type and length for protein structure determination. While a longer crosslinker will yield more restraints, the value of each restraint for protein structure prediction decreases because the restraint is consistent with a larger conformational space. In this study, the number of crosslinks and their discriminative power were systematically analyzed in silico on a set of 2055 non-redundant protein folds, considering Lys-Lys, Lys-Asp, Lys-Glu, Cys-Cys, and Arg-Arg reactive crosslinkers with lengths between 1 and 60 Å. A heuristic was developed that determines the optimal crosslinker length as a function of protein size. Next, simulated restraints of variable length were used to predict de novo the tertiary structure of fifteen proteins using the BCL::Fold algorithm. The results demonstrate that a distinct crosslinker length exists for which information content for de novo protein structure prediction is maximized. Sampling accuracy improves by 1.0 Å on average and by up to 2.2 Å in the most prominent example. XL-MS restraints consistently improve the selection of native-like models, with an average enrichment of 2.1.
Copyright © 2015. Published by Elsevier Inc.
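
The trade-off described in the abstract, that a longer crosslinker yields more restraints, can be illustrated with a minimal sketch that counts residue pairs of a given reactive type within reach of a crosslinker. This is not the procedure used in the study: the function name `count_crosslinks`, the input format, and the toy coordinates are hypothetical, and the sketch assumes a plain straight-line (Euclidean) distance between one representative atom per residue, which ignores the path a real crosslinker must take around the protein surface.

```python
from itertools import combinations
from math import dist


def count_crosslinks(residues, pair, max_length):
    """Count residue pairs of the requested types within max_length (in Å).

    residues: list of (residue_name, (x, y, z)) tuples (hypothetical format).
    pair: two residue names, e.g. ("LYS", "LYS") or ("LYS", "GLU").
    Assumption: straight-line distance stands in for crosslinker reach.
    """
    wanted = sorted(pair)
    count = 0
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(residues, 2):
        # Order-independent match of the residue types, then a distance cutoff.
        if sorted((name_a, name_b)) == wanted and dist(xyz_a, xyz_b) <= max_length:
            count += 1
    return count


# Toy coordinates, made up for illustration; not taken from a real structure.
toy = [
    ("LYS", (0.0, 0.0, 0.0)),
    ("LYS", (10.0, 0.0, 0.0)),
    ("GLU", (0.0, 5.0, 0.0)),
    ("LYS", (0.0, 0.0, 30.0)),
]

# Scan a few crosslinker lengths to see how the restraint count grows.
for length in (5.0, 15.0, 35.0):
    print(length, count_crosslinks(toy, ("LYS", "LYS"), length))
```

Scanning the cutoff in this way shows only the first half of the trade-off (more pairs at larger lengths); judging how discriminative each restraint is would additionally require comparing native and non-native models, which this sketch does not attempt.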