Computer Science Publications
Browsing Computer Science Publications by Author "Abella, Jayvee R."
Now showing 1 - 7 of 7

APE-Gen: A Fast Method for Generating Ensembles of Bound Peptide-MHC Conformations
(MDPI, 2019) Abella, Jayvee R.; Antunes, Dinler A.; Clementi, Cecilia; Kavraki, Lydia E.
The Class I Major Histocompatibility Complex (MHC) is a central protein in immunology as it binds to intracellular peptides and displays them at the cell surface for recognition by T-cells. The structural analysis of bound peptide-MHC complexes (pMHCs) holds the promise of interpretable and general binding prediction (i.e., testing whether a given peptide binds to a given MHC). However, structural analysis is limited in part by the difficulty in modelling pMHCs given the size and flexibility of the peptides that can be presented by MHCs. This article describes APE-Gen (Anchored Peptide-MHC Ensemble Generator), a fast method for generating ensembles of bound pMHC conformations. APE-Gen generates an ensemble of bound conformations by iterated rounds of (i) anchoring the ends of a given peptide near known pockets in the binding site of the MHC, (ii) sampling peptide backbone conformations with loop modelling, and then (iii) performing energy minimization to fix steric clashes, accumulating conformations at each round. APE-Gen takes only minutes on a standard desktop to generate tens of bound conformations, and we show the ability of APE-Gen to sample conformations found in X-ray crystallography even when only sequence information is used as input. APE-Gen has the potential to be useful for its scalability (i.e., modelling thousands of pMHCs or even non-canonical longer peptides) and for its use as a flexible search tool.
We demonstrate an example for studying cross-reactivity.

HLA-Arena: A Customizable Environment for the Structural Modeling and Analysis of Peptide-HLA Complexes for Cancer Immunotherapy
(ASCO, 2020) Antunes, Dinler A.; Abella, Jayvee R.; Hall-Swan, Sarah; Devaurs, Didier; Conev, Anja; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.
PURPOSE: HLA protein receptors play a key role in cellular immunity. They bind intracellular peptides and display them for recognition by T-cell lymphocytes. Because T-cell activation is partially driven by structural features of these peptide-HLA complexes, their structural modeling and analysis are becoming central components of cancer immunotherapy projects. Unfortunately, this kind of analysis is limited by the small number of experimentally determined structures of peptide-HLA complexes. Overcoming this limitation requires developing novel computational methods to model and analyze peptide-HLA structures. METHODS: Here we describe a new platform for the structural modeling and analysis of peptide-HLA complexes, called HLA-Arena, which we have implemented using Jupyter Notebook and Docker. It is a customizable environment that facilitates the use of computational tools, such as APE-Gen and DINC, which we have previously applied to peptide-HLA complexes. By integrating other commonly used tools, such as MODELLER and MHCflurry, this environment includes support for diverse tasks in structural modeling, analysis, and visualization. RESULTS: To illustrate the capabilities of HLA-Arena, we describe 3 example workflows applied to peptide-HLA complexes. Leveraging the strengths of our tools, DINC and APE-Gen, the first 2 workflows show how to perform geometry prediction for peptide-HLA complexes and structure-based binding prediction, respectively. The third workflow presents an example of large-scale virtual screening of peptides for multiple HLA alleles.
CONCLUSION: These workflows illustrate the potential benefits of HLA-Arena for the structural modeling and analysis of peptide-HLA complexes. Because HLA-Arena can easily be integrated within larger computational pipelines, we expect its potential impact to vastly increase. For instance, it could be used to conduct structural analyses for personalized cancer immunotherapy, neoantigen discovery, or vaccine development.

Large-Scale Structure-Based Prediction of Stable Peptide Binding to Class I HLAs Using Random Forests
(Frontiers, 2020) Abella, Jayvee R.; Antunes, Dinler A.; Clementi, Cecilia; Kavraki, Lydia E.; Center for Theoretical Biological Physics
Prediction of stable peptide binding to Class I HLAs is an important component for designing immunotherapies. While the best performing predictors are based on machine learning algorithms trained on peptide-HLA (pHLA) sequences, the use of structure for training predictors deserves further exploration. Given enough pHLA structures, a predictor based on the residue-residue interactions found in these structures has the potential to generalize for alleles with little or no experimental data. We have previously developed APE-Gen, a modeling approach able to produce pHLA structures in a scalable manner. In this work we use APE-Gen to model over 150,000 pHLA structures, the largest dataset of its kind, which were used to train a structure-based pan-allele model. We extract simple, homogenous features based on residue-residue distances between peptide and HLA, and build a random forest model for predicting stable pHLA binding. Our model achieves competitive AUROC values on leave-one-allele-out validation tests using significantly less data when compared to popular sequence-based methods. Additionally, our model offers an interpretation analysis that can reveal how the model composes the features to arrive at any given prediction.
This interpretation analysis can be used to check if the model is in line with chemical intuition, and we showcase particular examples. Our work is a significant step toward using structure to achieve generalizable and more interpretable prediction for stable pHLA binding.

Maintaining and Enhancing Diversity of Sampled Protein Conformations in Robotics-Inspired Methods
(Mary Ann Liebert, Inc., 2018) Abella, Jayvee R.; Moll, Mark; Kavraki, Lydia E.
The ability to efficiently sample structurally diverse protein conformations allows one to gain a high-level view of a protein's energy landscape. Algorithms from robot motion planning have been used for conformational sampling, and several of these algorithms promote diversity by keeping track of "coverage" in conformational space based on the local sampling density. However, large proteins present special challenges. In particular, larger systems require running many concurrent instances of these algorithms, but these algorithms can quickly become memory intensive because they typically keep previously sampled conformations in memory to maintain coverage estimates. In addition, robotics-inspired algorithms depend on defining useful perturbation strategies for exploring the conformational space, which is a difficult task for large proteins because such systems are typically more constrained and exhibit complex motions. In this article, we introduce two methodologies for maintaining and enhancing diversity in robotics-inspired conformational sampling. The first method addresses algorithms based on coverage estimates and leverages the use of a low-dimensional projection to define a global coverage grid that maintains coverage across concurrent runs of sampling. The second method is an automatic definition of a perturbation strategy through readily available flexibility information derived from B-factors, secondary structure, and rigidity analysis.
Our results show a significant increase in the diversity of the conformations sampled for proteins consisting of up to 500 residues when applied to a specific robotics-inspired algorithm for conformational sampling. The methodologies presented in this article may be vital components for the scalability of robotics-inspired approaches.

Quantitative comparison of adaptive sampling methods for protein dynamics
(AIP Publishing LLC, 2018) Hruska, Eugen; Abella, Jayvee R.; Nüske, Feliks; Kavraki, Lydia E.; Clementi, Cecilia; Center for Theoretical Biological Physics
Adaptive sampling methods, often used in combination with Markov state models, are becoming increasingly popular for speeding up rare events in simulation such as molecular dynamics (MD) without biasing the system dynamics. Several adaptive sampling strategies have been proposed, but it is not clear which methods perform better for different physical systems. In this work, we present a systematic evaluation of selected adaptive sampling strategies on a wide selection of fast folding proteins. The adaptive sampling strategies were emulated using models constructed on already existing MD trajectories. We provide theoretical limits for the sampling speed-up and compare the performance of different strategies with and without using some a priori knowledge of the system. The results show that for different goals, different adaptive sampling strategies are optimal. In order to sample slow dynamical processes such as protein folding without a priori knowledge of the system, a strategy based on the identification of a set of metastable regions is consistently the most efficient, while a strategy based on the identification of microstates performs better if the goal is to explore newer regions of the conformational space.
Interestingly, the maximum speed-up achievable for the adaptive sampling of slow processes increases for proteins with longer folding times, encouraging the application of these methods for the characterization of slower processes, beyond the fast-folding proteins considered here.

Structural Modeling and Molecular Dynamics of the Immune Checkpoint Molecule HLA-G
(Frontiers, 2020) Arns, Thais; Antunes, Dinler A.; Abella, Jayvee R.; Rigo, Maurício M.; Kavraki, Lydia E.; Giuliatti, Silvana; Donadi, Eduardo A.
HLA-G is considered to be an immune checkpoint molecule, a function that is closely linked to the structure and dynamics of the different HLA-G isoforms. Unfortunately, little is known about the structure and dynamics of these isoforms. For instance, there are only seven crystal structures of HLA-G molecules, all related to a single isoform, and in some cases lacking important residues associated with the interaction with leukocyte receptors. In addition, they lack information on the dynamics of both membrane-bound and soluble HLA-G forms. We took advantage of in silico strategies to disclose the dynamic behavior of selected HLA-G forms, including the membrane-bound HLA-G1 molecule, soluble HLA-G1 dimer, and HLA-G5 isoform. Both the membrane-bound HLA-G1 molecule and the soluble HLA-G1 dimer were quite stable. Residues involved in the interaction with ILT2 and ILT4 receptors (α3 domain) were very close to the lipid bilayer in the complete HLA-G1 molecule, which might limit accessibility. On the other hand, these residues can be completely exposed in the soluble HLA-G1 dimer, due to the free rotation of the disulfide bridge (Cys42/Cys42). In fact, we speculate that this free rotation of each protomer (i.e., the chains composing the dimer) could enable alternative binding modes for ILT2/ILT4 receptors, which in turn could be associated with greater affinity of the soluble HLA-G1 dimer.
Structural analysis of the HLA-G5 isoform demonstrated higher stability for the complex containing the peptide and coupled β2-microglobulin, while structures lacking such domains were significantly unstable. This study reports for the first time structural conformations for the HLA-G5 isoform and the dynamic behavior of HLA-G1 molecules under simulated biological conditions. All modeled structures were made available through GitHub (https://github.com/KavrakiLab/), enabling their use as templates for modeling other alleles and isoforms, as well as for other computational analyses to investigate key molecular interactions.

Structure-based Methods for Binding Mode and Binding Affinity Prediction for Peptide-MHC Complexes
(Bentham Science, 2018) Antunes, Dinler A.; Abella, Jayvee R.; Devaurs, Didier; Rigo, Maurício M.; Kavraki, Lydia E.
Understanding the mechanisms involved in the activation of an immune response is essential to many fields in human health, including vaccine development and personalized cancer immunotherapy. A central step in the activation of the adaptive immune response is the recognition, by T-cell lymphocytes, of peptides displayed by a special type of receptor known as Major Histocompatibility Complex (MHC). Considering the key role of MHC receptors in T-cell activation, the computational prediction of peptide binding to MHC has been an important goal for many immunological applications. Sequence-based methods have become the gold standard for peptide-MHC binding affinity prediction, but structure-based methods are expected to provide more general predictions (i.e., predictions applicable to all types of MHC receptors). In addition, structural modeling of peptide-MHC complexes has the potential to uncover yet unknown drivers of T-cell activation, thus allowing for the development of better and safer therapies.
In this review, we discuss the use of computational methods for the structural modeling of peptide-MHC complexes (i.e., binding mode prediction) and for the structure-based prediction of binding affinity.