Predicting protein-ligand interactions from primary structure

Date
2002-02-15
Abstract

One of the key challenges in the post-genomic era is to understand protein-ligand interactions on a large scale. The question is: given the primary structures of a protein and a ligand, how well can we computationally predict whether the ligand will bind to the protein? Wet laboratory experiments using combinatorial peptide screens and phage display techniques have yielded positive and negative examples of protein-ligand binding (Sparks, Zucconi, Alexandropoulos). In this paper, we model the prediction of protein-ligand interactions from primary structure as a classification problem and train naive Bayes classifiers (Mitchell) to distinguish between positive and negative examples of protein-ligand interactions. Such a predictive model can screen large numbers of potential ligands and save laboratory time and costs. We demonstrate the power of our approach in predicting interactions between SH3 domains and proline-rich ligands. We use laboratory data gathered from combinatorial peptide library screening (Sparks) of 8 diverse SH3 domains to construct a body of positive and negative examples. We learn naive Bayes models of the ligand binding specificity of these SH3 domains and test them using a cross-validation approach. The models have prediction accuracies of 90% and higher, with low false positive and false negative rates. In addition, we visualize our classification model to reveal sites on both the ligand and the SH3 domain that contribute to the interaction. We use our classifiers to screen PxxP ligands from Swissprot for given SH3 domains. Over 80% of these ligands are eliminated by our naive Bayes classifiers for 5 of the 8 SH3 domains considered in this paper.
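
The classification setup described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: it assumes fixed-length ligand peptides, treats each sequence position as an independent categorical feature, uses Laplace smoothing, and models ligand residues only for a single domain. The function names (`train_naive_bayes`, `site_contributions`, `predict`) and the six-residue peptides are hypothetical, chosen purely for illustration, and are not data from the report.

```python
import math
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def train_naive_bayes(examples, alpha=1.0):
    """Fit a position-specific naive Bayes model.

    examples: list of (peptide, binds) pairs; all peptides share one length.
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    length = len(examples[0][0])
    counts = {label: [defaultdict(int) for _ in range(length)]
              for label in (True, False)}
    totals = {True: 0, False: 0}
    for peptide, label in examples:
        totals[label] += 1
        for i, residue in enumerate(peptide):
            counts[label][i][residue] += 1

    n = len(examples)
    model = {"log_prior": {}, "log_lik": {}}
    for label in (True, False):
        model["log_prior"][label] = math.log(
            (totals[label] + alpha) / (n + 2 * alpha))
        denom = totals[label] + alpha * len(AMINO_ACIDS)
        model["log_lik"][label] = [
            {aa: math.log((counts[label][i][aa] + alpha) / denom)
             for aa in AMINO_ACIDS}
            for i in range(length)
        ]
    return model

def site_contributions(model, peptide):
    """Per-position log-odds (binder vs. non-binder) for one peptide."""
    return [model["log_lik"][True][i][aa] - model["log_lik"][False][i][aa]
            for i, aa in enumerate(peptide)]

def predict(model, peptide):
    """Return True if the peptide is classified as a binder."""
    prior = model["log_prior"][True] - model["log_prior"][False]
    return prior + sum(site_contributions(model, peptide)) > 0

# Hypothetical toy data: binders carry prolines at positions 1 and 4.
toy_examples = [
    ("APPLPR", True), ("RPLPPK", True), ("KPALPR", True),
    ("AGSLTR", False), ("RGDLEK", False), ("KAALGR", False),
]
toy_model = train_naive_bayes(toy_examples)
print(predict(toy_model, "RPSLPK"), predict(toy_model, "AGDLTK"))
```

The per-position log-odds returned by `site_contributions` are one simple way to see which ligand sites drive a prediction, in the spirit of the visualization the abstract mentions; positions with large positive values favor binding under this toy model.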

Type
Technical report
Citation

Bandyopadhyay, Raj, Matthews, K, Subramanian, D, et al. "Predicting protein-ligand interactions from primary structure." (2002) https://hdl.handle.net/1911/96294.

Rights
You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).