Adapting learning and search algorithms to handle protein structural data with the goal of aiding drug discovery
dc.contributor.advisor | Kavraki, Lydia E | en_US |
dc.creator | Conev, Anja | en_US |
dc.date.accessioned | 2025-01-16T20:16:07Z | en_US |
dc.date.created | 2024-12 | en_US |
dc.date.issued | 2024-09-16 | en_US |
dc.date.submitted | December 2024 | en_US |
dc.date.updated | 2025-01-16T20:16:07Z | en_US |
dc.description.abstract | Experimental methods for protein structure determination (e.g., X-ray crystallography, NMR, cryo-EM) require access to expensive equipment and are not scalable. Computational methods assist protein structure prediction and analysis on a far larger scale. Recent deep learning advances, the most notable being DeepMind’s AlphaFold2.0 release in 2021, have provided a wealth of structural data for further analysis and opened new opportunities for algorithmic development. In my work, I address three tasks that make use of the available protein structure data: (1) system-specific binding-affinity prediction (in the context of the immune-related peptide-HLA system); (2) generation of representative ensembles from generic protein structure datasets; and (3) protein-ligand ensemble docking. To this end, I examine and adapt a range of algorithms, including random forest regression models, unsupervised learning methods, and stochastic global optimization techniques. I validate the resulting pipelines on available experimental data and apply them to different macromolecular contexts: the immune-related formation of the peptide-HLA complex, the flexibility of the signal transducer PI3K lipid kinase, the CDK2 protein kinase, and estrogen receptor α. The developed pipelines are open source and freely available, and can help guide the search for novel therapeutics. | en_US |
dc.embargo.lift | 2025-06-01 | en_US |
dc.embargo.terms | 2025-06-01 | en_US |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.uri | https://hdl.handle.net/1911/118187 | en_US |
dc.language.iso | en | en_US |
dc.subject | Protein structure | en_US |
dc.subject | machine learning | en_US |
dc.subject | unsupervised learning | en_US |
dc.subject | peptide-HLA | en_US |
dc.subject | molecular docking | en_US |
dc.title | Adapting learning and search algorithms to handle protein structural data with the goal of aiding drug discovery | en_US |
dc.type | Thesis | en_US |
dc.type.material | Text | en_US |
thesis.degree.department | Computer Science | en_US |
thesis.degree.discipline | Computer Science | en_US |
thesis.degree.grantor | Rice University | en_US |
thesis.degree.level | Doctoral | en_US |
thesis.degree.name | Doctor of Philosophy | en_US |