Phylogenetic profiles for the prediction of protein-protein interactions: how to select reference organisms?

Sun J, Li Y, Zhao Z
Biochem Biophys Res Commun. 2007 353 (4): 985-91

PMID: 17207465 · DOI:10.1016/j.bbrc.2006.12.146

The phylogenetic profile method has been widely applied in the prediction of protein-protein interactions (PPIs). Studies often use all of the available complete genomes for this method. With more than 400 complete genomes available and new ones on the horizon, it remains unclear how reference organisms should be selected for profile construction and how that selection influences PPI prediction. Here, we performed a systematic assessment of reference organism selection from 225 complete genomes with their evolutionary tree. Our results suggest that reference organisms should be selected from moderately and highly genetically distant organisms, from all three domains (Bacteria, Archaea, and Eukarya), and so that they are evenly distributed at the fifth hierarchical level of the evolutionary tree. Our study provides important guidance on the construction of phylogenetic profiles for PPI prediction and functional genomics, which has become challenging due to the large and increasing number of available candidate organisms.
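As a rough illustration of the phylogenetic profile idea, the sketch below encodes each protein as a presence/absence vector over a set of reference genomes and scores a protein pair by the mutual information of the two vectors. The abstract does not state which similarity measure the authors used, so mutual information is an assumption here, and the function names, genome labels, and protein hit sets are all hypothetical.

```python
from math import log2


def build_profile(genomes_with_homolog, reference_genomes):
    """Encode a protein as 0/1 flags, one per reference genome.

    A flag is 1 when the protein has a detected homolog in that genome.
    How homologs are detected is outside the scope of this sketch.
    """
    return tuple(1 if g in genomes_with_homolog else 0 for g in reference_genomes)


def mutual_information(profile_a, profile_b):
    """Mutual information (in bits) between two equal-length binary profiles."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(profile_a)
    mi = 0.0
    for a in (0, 1):
        p_a = sum(1 for x in profile_a if x == a) / n
        for b in (0, 1):
            p_b = sum(1 for y in profile_b if y == b) / n
            p_ab = sum(1 for x, y in zip(profile_a, profile_b) if x == a and y == b) / n
            if p_ab > 0:  # zero-probability cells contribute nothing
                mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical reference set and homolog hits, for illustration only.
reference_genomes = ["genome_A", "genome_B", "genome_C", "genome_D"]
protein_x = build_profile({"genome_A", "genome_B"}, reference_genomes)
protein_y = build_profile({"genome_A", "genome_B"}, reference_genomes)
protein_z = build_profile({"genome_A", "genome_C"}, reference_genomes)

score_xy = mutual_information(protein_x, protein_y)  # co-occurring pair
score_xz = mutual_information(protein_x, protein_z)  # unrelated pattern
```

In this framing, the choice of `reference_genomes` determines which columns exist in every profile, which is why the selection of reference organisms can change the scores assigned to protein pairs.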

MeSH Terms (13)

Algorithms; Animals; Archaeal Proteins; Bacterial Proteins; Computational Biology; Databases, Genetic; Eukaryotic Cells; Genome; Humans; Phylogeny; Protein Binding; Protein Interaction Mapping; Reproducibility of Results

