Adapting learning and search algorithms to handle protein structural data with the goal of aiding drug discovery

Conev, Anja

Date

2024-09-16

Authors

Conev, Anja

Abstract

Experimental methods for protein structure determination (e.g., X-ray crystallography, NMR, cryo-EM) require access to expensive equipment and are not scalable. Computational methods assist protein structure prediction and analysis on a far larger scale. Recent deep learning advances, the most notable being DeepMind’s AlphaFold2.0 release in 2021, have provided a wealth of structural data for further analysis and opened new opportunities for algorithmic development. In my work, I address three different tasks that make use of the available protein structure data: (1) system-specific binding-affinity prediction (in the context of the immune-related peptide-HLA system); (2) generation of representative ensembles from generic protein structure datasets; and (3) protein-ligand ensemble docking. To this end, I examine and adapt a range of algorithms, including random forest regression models, unsupervised learning methods, and stochastic global optimization techniques. I validate the resulting pipelines on available experimental data and apply them to different macromolecular contexts: the immune-related formation of the peptide-HLA complex; the flexibility of the signal transducer PI3K lipid kinase; and the CDK2 protein kinase and estrogen receptor α. The developed pipelines are open source and freely available, and can help guide the search for novel therapeutics.

Advisor

Kavraki, Lydia E

Degree

Doctor of Philosophy

Type

Thesis

Keywords

Protein structure, machine learning, unsupervised learning, peptide-HLA, molecular docking

Citable link to this page

https://hdl.handle.net/1911/118187

Collections

Rice University Theses and Dissertations
