Browsing by Author "Moll, Mark"
Item A review of parameters and heuristics for guiding metabolic pathfinding (Springer International Publishing, 2017-09-15)
Kim, Sarah M.; Peña, Matthew I.; Moll, Mark; Bennett, George N.; Kavraki, Lydia E.
Recent developments in metabolic engineering have led to the successful biosynthesis of valuable products, such as the precursor of the antimalarial compound, artemisinin, and opioid precursor, thebaine. Synthesizing these traditionally plant-derived compounds in genetically modified yeast cells introduces the possibility of significantly reducing the total time and resources required for their production, and in turn, allows these valuable compounds to become cheaper and more readily available. Most biosynthesis pathways used in metabolic engineering applications have been discovered manually, requiring a tedious search of existing literature and metabolic databases. However, the recent rapid development of available metabolic information has enabled the development of automated approaches for identifying novel pathways. Computer-assisted pathfinding has the potential to save biochemists time in the initial discovery steps of metabolic engineering. In this paper, we review the parameters and heuristics used to guide the search in recent pathfinding algorithms. These parameters and heuristics capture information on the metabolic network structure, compound structures, reaction features, and organism-specificity of pathways. No one metabolic pathfinding algorithm or search parameter stands out as the best to use broadly for solving the pathfinding problem, as each method and parameter has its own strengths and shortcomings.
As assisted pathfinding approaches continue to become more sophisticated, the development of better methods for visualizing pathway results and integrating these results into existing metabolic engineering practices is also important for encouraging wider use of these pathfinding methods.

Item Atlas + X: Sampling-based Planners on Constraint Manifolds (2017-06-14)
Voss, Caleb; Moll, Mark; Kavraki, Lydia E.
Sampling-based planners struggle when the valid configurations are constrained to an implicit manifold. Special planners have been proposed for this problem recently. Our new framework is decoupled from any particular planner and augments existing algorithms not explicitly designed for constraint planning. We demonstrate the advantages of our generalized approach.

Item Coarse-Grained Conformational Sampling of Protein Structure Improves the Fit to Experimental Hydrogen-Exchange Data (Frontiers Media S.A., 2017)
Devaurs, Didier; Antunes, Dinler A.; Papanastasiou, Malvina; Moll, Mark; Ricklin, Daniel; Lambris, John D.; Kavraki, Lydia E.
Monitoring hydrogen/deuterium exchange (HDX) undergone by a protein in solution produces experimental data that translates into valuable information about the protein's structure. Data produced by HDX experiments is often interpreted using a crystal structure of the protein, when available. However, it has been shown that the correspondence between experimental HDX data and crystal structures is often not satisfactory. This creates difficulties when trying to perform a structural analysis of the HDX data. In this paper, we evaluate several strategies to obtain a conformation providing a good fit to the experimental HDX data, which is a premise of an accurate structural analysis. We show that performing molecular dynamics simulations can be inadequate to obtain such conformations, and we propose a novel methodology involving a coarse-grained conformational sampling approach instead.
By extensively exploring the intrinsic flexibility of a protein with this approach, we produce a conformational ensemble from which we extract a single conformation providing a good fit to the experimental HDX data. We successfully demonstrate the applicability of our method to four small and medium-sized proteins.

Item Combinatorial Clustering of Residue Position Subsets Predicts Inhibitor Affinity across the Human Kinome (Public Library of Science, 2013)
Bryant, Drew H.; Moll, Mark; Finn, Paul W.; Kavraki, Lydia E.
The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, CCORPS is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. CCORPS is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while only having overall poor predictive ability for 1 of the 38 compounds. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown to not be rare among kinases.
Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.

Item Combining Sampling and Optimizing for Robotic Path Planning (2018-09-12)
Willey, Bryce Steven; Kavraki, Lydia E.; Moll, Mark
Robotic path planning is a critical problem in autonomous robotics. Two common approaches to robotic path planning are sampling-based motion planners and continuous optimization methods. Sampling-based motion planners explore the search space effectively, but either return low-quality paths or take a long time to initially find a path. Continuous optimization methods quickly find high-quality paths, but often return paths in collision with obstacles. This thesis combines sampling-based and continuous optimization techniques in order to improve the performance of these planning approaches. This thesis shows that the advantages and disadvantages of these approaches are complementary and proposes combining them into a pipeline. The proposed pipeline results in better path quality than either approach alone, providing a robust, efficient, and high-quality general path planning solution. The use of collision checking techniques introduced by continuous optimization methods in sampling-based planners is also analyzed, and approximation error rates and timing results are provided.

Item DINC 2.0: A New Protein–Peptide Docking Webserver Using an Incremental Approach (AACR, 2017)
Antunes, Dinler A.; Moll, Mark; Devaurs, Didier; Jackson, Kyle; Lizée, Gregory; Kavraki, Lydia E.
Molecular docking is a standard computational approach to predict binding modes of protein–ligand complexes by exploring alternative orientations and conformations of the ligand (i.e., by exploring ligand flexibility). Docking tools are largely used for virtual screening of small drug-like molecules, but their accuracy and efficiency greatly decay for ligands with more than 10 flexible bonds.
This prevents a broader use of these tools to dock larger ligands, such as peptides, which are molecules of growing interest in cancer research. To overcome this limitation, our group has previously proposed a meta-docking strategy, called DINC, to predict binding modes of large ligands. By incrementally docking overlapping fragments of a ligand, DINC allowed predicting binding modes of peptide-based inhibitors of transcription factors involved in cancer. Here, we describe DINC 2.0, a revamped version of the DINC webserver with enhanced capabilities and a more user-friendly interface. DINC 2.0 allows docking ligands that were previously too challenging for DINC, such as peptides with more than 25 flexible bonds. The webserver is freely accessible at http://dinc.kavrakilab.org, together with additional documentation and video tutorials. Our team will provide continuous support for this tool and is working on extending its applicability to other challenging fields, such as personalized immunotherapy against cancer.

Item Examining the Use of Homology Models in Predicting Kinase Binding Affinity (2013-12-05)
Chyan, Jeffrey; Kavraki, Lydia E.; Nakhleh, Luay K.; Jermaine, Christopher M.; Moll, Mark
Drug design is a difficult and multi-faceted problem that has led to extensive interdisciplinary work in the field of computational biology. In recent years, several computational methods have emerged. The overall goal of computational algorithms is to narrow down the number of leads that will be further considered for laboratory experimentation and clinical studies. Much of current drug design focuses on a family of proteins called kinases because they play a pivotal role in many of the cell signaling pathways in the human body. Drugs need to be designed such that they bind to specific kinases in the human kinome, inhibiting kinase functions that can cause various diseases such as cancer.
It is important for drugs to have high specificity, inhibiting only certain kinases and avoiding undesirable effects on the human body. Computational prediction methods can accomplish this complex task by doing a comparative analysis on the binding site of kinases both in sequence and structure to predict binding affinity with potential drugs. However, computational methods depend on existing protein data to make predictions. There is a lack of structural protein data relative to known proteins and protein sequences. A potential solution to the lack of information is to use computationally generated structural data called homology models. This thesis introduces a framework for the integration of homology models with CCORPS, a semi-supervised learning method that identifies structural features in proteins that correlate with protein function. We discuss the effects of using homology models to supplement existing experimental structural data for kinases to predict the binding affinity of kinases with various drugs in our experiments. While the work in this thesis focuses on predicting kinase binding affinity, the framework can be generalized, showing the potential of using CCORPS with computationally generated data when there is a lack of experimental data.

Item General Prediction of Peptide-MHC Binding Modes Using Incremental Docking: A Proof of Concept (Springer Nature, 2018)
Antunes, Dinler A.; Devaurs, Didier; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.
The class I major histocompatibility complex (MHC) is capable of binding peptides derived from intracellular proteins and displaying them at the cell surface. The recognition of these peptide-MHC (pMHC) complexes by T-cells is the cornerstone of cellular immunity, enabling the elimination of infected or tumoral cells. T-cell-based immunotherapies against cancer, which leverage this mechanism, can greatly benefit from structural analyses of pMHC complexes.
Several attempts have been made to use molecular docking for such analyses, but pMHC structure remains too challenging for even state-of-the-art docking tools. To overcome these limitations, we describe the use of an incremental meta-docking approach for structural prediction of pMHC complexes. Previous methods applied in this context used specific constraints to reduce the complexity of this prediction problem, at the expense of generality. Our strategy makes no assumption and can potentially be used to predict binding modes for any pMHC complex. Our method has been tested in a re-docking experiment, reproducing the binding modes of 25 pMHC complexes whose crystal structures are available. This study is a proof of concept that incremental docking strategies can lead to general geometry prediction of pMHC complexes, with potential applications for immunotherapy against cancer or infectious diseases.

Item HLA-Arena: A Customizable Environment for the Structural Modeling and Analysis of Peptide-HLA Complexes for Cancer Immunotherapy (ASCO, 2020)
Antunes, Dinler A.; Abella, Jayvee R.; Hall-Swan, Sarah; Devaurs, Didier; Conev, Anja; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.
PURPOSE: HLA protein receptors play a key role in cellular immunity. They bind intracellular peptides and display them for recognition by T-cell lymphocytes. Because T-cell activation is partially driven by structural features of these peptide-HLA complexes, their structural modeling and analysis are becoming central components of cancer immunotherapy projects. Unfortunately, this kind of analysis is limited by the small number of experimentally determined structures of peptide-HLA complexes. Overcoming this limitation requires developing novel computational methods to model and analyze peptide-HLA structures. METHODS: Here we describe a new platform for the structural modeling and analysis of peptide-HLA complexes, called HLA-Arena, which we have implemented using Jupyter Notebook and Docker.
It is a customizable environment that facilitates the use of computational tools, such as APE-Gen and DINC, which we have previously applied to peptide-HLA complexes. By integrating other commonly used tools, such as MODELLER and MHCflurry, this environment includes support for diverse tasks in structural modeling, analysis, and visualization. RESULTS: To illustrate the capabilities of HLA-Arena, we describe 3 example workflows applied to peptide-HLA complexes. Leveraging the strengths of our tools, DINC and APE-Gen, the first 2 workflows show how to perform geometry prediction for peptide-HLA complexes and structure-based binding prediction, respectively. The third workflow presents an example of large-scale virtual screening of peptides for multiple HLA alleles. CONCLUSION: These workflows illustrate the potential benefits of HLA-Arena for the structural modeling and analysis of peptide-HLA complexes. Because HLA-Arena can easily be integrated within larger computational pipelines, we expect its potential impact to vastly increase. For instance, it could be used to conduct structural analyses for personalized cancer immunotherapy, neoantigen discovery, or vaccine development.

Item How Much Do Unstated Problem Constraints Limit Deep Robotic Reinforcement Learning? (2019)
Lewis, W. Cannon II; Moll, Mark; Kavraki, Lydia E.
Deep Reinforcement Learning is a promising paradigm for robotic control which has been shown to be capable of learning policies for high-dimensional, continuous control of unmodeled systems. However, Robotic Reinforcement Learning currently lacks clearly defined benchmark tasks, which makes it difficult for researchers to reproduce and compare against prior work. “Reacher” tasks, which are fundamental to robotic manipulation, are commonly used as benchmarks, but the lack of a formal specification elides details that are crucial to replication.
In this paper we present a novel empirical analysis which shows that the unstated spatial constraints in commonly used implementations of Reacher tasks make it dramatically easier to learn a successful control policy with Deep Deterministic Policy Gradients (DDPG), a state-of-the-art Deep RL algorithm. Our analysis suggests that less constrained Reacher tasks are significantly more difficult to learn, and hence that existing de facto benchmarks are not representative of the difficulty of general robotic manipulation.

Item Maintaining and Enhancing Diversity of Sampled Protein Conformations in Robotics-Inspired Methods (Mary Ann Liebert, Inc., 2018)
Abella, Jayvee R.; Moll, Mark; Kavraki, Lydia E.
The ability to efficiently sample structurally diverse protein conformations allows one to gain a high-level view of a protein's energy landscape. Algorithms from robot motion planning have been used for conformational sampling, and several of these algorithms promote diversity by keeping track of "coverage" in conformational space based on the local sampling density. However, large proteins present special challenges. In particular, larger systems require running many concurrent instances of these algorithms, but these algorithms can quickly become memory intensive because they typically keep previously sampled conformations in memory to maintain coverage estimates. In addition, robotics-inspired algorithms depend on defining useful perturbation strategies for exploring the conformational space, which is a difficult task for large proteins because such systems are typically more constrained and exhibit complex motions. In this article, we introduce two methodologies for maintaining and enhancing diversity in robotics-inspired conformational sampling. The first method addresses algorithms based on coverage estimates and leverages the use of a low-dimensional projection to define a global coverage grid that maintains coverage across concurrent runs of sampling.
The second method is an automatic definition of a perturbation strategy through readily available flexibility information derived from B-factors, secondary structure, and rigidity analysis. Our results show a significant increase in the diversity of the conformations sampled for proteins consisting of up to 500 residues when applied to a specific robotics-inspired algorithm for conformational sampling. The methodologies presented in this article may be vital components for the scalability of robotics-inspired approaches.

Item Modeling 3D Minimal-Energy Curves of Given Length (2005-01-04)
Kavraki, Lydia E.; Moll, Mark
We present a subdivision scheme for the construction of 3D minimal-energy curves of given length that satisfy endpoint constraints. When given desired positions and tangents for the endpoints, and the length of the curve, the scheme iteratively builds up a minimal-energy curve. During each iteration the algorithm solves a low-dimensional optimization problem, whereby the energy of the curve is lowered and at the same time the endpoint constraints are satisfied. The energy of the curve is defined as the integral of the curvature squared and the torsion squared. With this energy function, minimal-energy curves correspond to stable configurations of flexible inextensible wires. A curve is represented by segments of piecewise constant curvature and torsion. The representation is adaptive in the sense that the number of parameters automatically varies with the complexity of the underlying curve. This scheme has been implemented and simulation results show that it typically quickly converges to very smooth curves. Our minimal-energy curve framework can be extended to minimal-energy curves of fixed length that pass through several control points and tangents.
This work has applications in modeling flexible inextensible wires such as surgical sutures.

Item Motion Planning with Uncertain Information in Robotic Tasks (2014-03-25)
Grady, Devin; Kavraki, Lydia E.; McLurkin, James; Moll, Mark; O'Malley, Marcia K.
In the real world, robots operate with imperfect sensors providing uncertain and incomplete information. We develop techniques to solve motion planning problems with imperfect information in order to accomplish a variety of robotic tasks including navigation, search-and-rescue, and exposure minimization. This thesis focuses on the challenge of creating robust policies for robots with imperfect actions and sensing. These policies map input observations to output actions. The tools that exist to solve these problems are typically Partially-Observable Markov Decision Processes (POMDPs), and can only handle small problem instances. This thesis proposes several techniques to expand the size of the problem instance that can be considered. Because executing a policy is simple once the offline computation is done, even inexpensive, computationally constrained robots can use these policies and solve the tasks mentioned. First we show that the solution of an abstracted action space can be used to bootstrap a complete solution for navigation. Generalizing this action space abstraction to both action and state spaces expands the set of problems that can be solved. Additionally, the concept of abstraction is applied to the workspace -- we develop a method to compute local solutions to a noisy navigation problem, then stitch them together into a global solution. Our proposed methods are run on large problem instances, and the output policies are compared against policies generated with existing techniques. Though these large tasks are often unsolvable with previous methods, abstraction allows us to find high quality policies.
Our findings show that these techniques significantly increase the size of tasks involving planning with uncertain information for which solutions can be found. The techniques presented generally offer significant speed increases and often solution quality improvements as well. Additionally, this thesis includes work on two separate problems. First, we solve a task where several robots cooperate to quickly classify an observed object as one of several possible types using a camera. Then, we proceed to solve a task where a single robot navigates to a destination quickly, but the robot may need to allocate time towards obtaining information about a new object discovered along the way.

Item On Flexible Docking Using Expansive Search (2005-02-22)
Heath, Allison; Kavraki, Lydia E.; Moll, Mark; Schwarz, David
The activity of most drugs is regulated by the binding of one molecule (the ligand) to a pocket of another, usually larger, molecule, which is commonly a protein. This report describes a new approach to creating low-energy structures of flexible proteins to which ligands can be docked. The flexibility of molecules is encoded with thousands of parameters, making the search for valid complexes a formidable problem. Our method takes into account the flexibility of the protein as this can be encoded by its major modes of motion. The output of the program consists of low-energy protein conformations that can then be docked with a ligand using a traditional docking program. We employ a robotics-based approach for exploring the conformational space of the protein. Our long-term goal is to develop an efficient, accurate, and automated algorithm that will be used to screen large databases of molecules for novel therapeutics.

Item Robonaut 2 and you: Specifying and executing complex operations (IEEE, 2017)
Baker, William; Kingston, Zachary; Moll, Mark; Badger, Julia; Kavraki, Lydia
Crew time is a precious resource due to the expense of trained human operators in space.
Efficient caretaker robots could lessen the manual labor load required by frequent vehicular and life support maintenance tasks, freeing astronaut time for scientific mission objectives. Humanoid robots can fluidly exist alongside human counterparts due to their form, but they are complex and high-dimensional platforms. This paper describes a system that human operators can use to maneuver Robonaut 2 (R2), a dexterous humanoid robot developed by NASA to research co-robotic applications. The system includes a specification of constraints used to describe operations, and the supporting planning framework that solves constrained problems on R2 at interactive speeds. The paper is developed in reference to an illustrative, typical example of an operation R2 performs to highlight the challenges inherent to the problems R2 must face. Finally, the interface and planner are validated through a case study using the guiding example on the physical robot in a simulated microgravity environment. This work reveals the complexity of employing humanoid caretaker robots and suggests solutions that are broadly applicable.

Item SIMS: A Hybrid Method for Rapid Conformational Analysis (Public Library of Science, 2013)
Gipson, Bryant; Moll, Mark; Kavraki, Lydia E.
Proteins are at the root of many biological functions, often performing complex tasks as the result of large changes in their structure. Describing the exact details of these conformational changes, however, remains a central challenge for computational biology due to the enormous computational requirements of the problem. This has engendered the development of a rich variety of useful methods designed to answer specific questions at different levels of spatial, temporal, and energetic resolution. These methods fall largely into two classes: physically accurate, but computationally demanding methods and fast, approximate methods.
We introduce here a new hybrid modeling tool, the Structured Intuitive Move Selector (SIMS), designed to bridge the divide between these two classes, while allowing the benefits of both to be seamlessly integrated into a single framework. This is achieved by applying a modern motion planning algorithm, borrowed from the field of robotics, in tandem with a well-established protein modeling library. SIMS can combine precise energy calculations with approximate or specialized conformational sampling routines to produce rapid, yet accurate, analysis of the large-scale conformational variability of protein systems. Several key advancements are shown, including the abstract use of generically defined moves (conformational sampling methods) and an expansive probabilistic conformational exploration. We present three example problems that SIMS is applied to and demonstrate a rapid solution for each. These include the automatic determination of "active" residues for the hinge-based system Cyanovirin-N, exploring conformational changes involving long-range coordinated motion between non-sequential residues in Ribose-Binding Protein, and the rapid discovery of a transient conformational state of Maltose-Binding Protein, previously only determined by Molecular Dynamics. For all cases we provide energetic validations using well-established energy fields, demonstrating this framework as a fast and accurate tool for the analysis of a wide range of protein flexibility problems.

Item Structure-guided selection of specificity determining positions in the human Kinome (BioMed Central, 2016)
Moll, Mark; Finn, Paul W.; Kavraki, Lydia E.
Background: The human kinome contains many important drug targets. It is well-known that inhibitors of protein kinases bind with very different selectivity profiles. This is also the case for inhibitors of many other protein families.
The increased availability of protein 3D structures has provided much information on the structural variation within a given protein family. However, the relationship between structural variations and binding specificity is complex and incompletely understood. We have developed a structural bioinformatics approach which provides an analysis of key determinants of binding selectivity as a tool to enhance the rational design of drugs with a specific selectivity profile. Results: We propose a greedy algorithm that computes a subset of residue positions in a multiple sequence alignment such that structural and chemical variation in those positions helps explain known binding affinities. By providing this information, the main purpose of the algorithm is to provide experimentalists with possible insights into how the selectivity profile of certain inhibitors is achieved, which is useful for lead optimization. In addition, the algorithm can also be used to predict binding affinities for structures whose affinity for a given inhibitor is unknown. The algorithm's performance is demonstrated using an extensive dataset for the human kinome. Conclusion: We show that the binding affinity of 38 different kinase inhibitors can be explained with consistently high precision and accuracy using the variation of at most six residue positions in the kinome binding site. We show for several inhibitors that we are able to identify residues that are known to be functionally important.

Item Synthesis of Integrated Task and Motion Plans from Plan Outlines Using SMT Solvers (2015-01-09)
Chaudhuri, Swarat; Kavraki, Lydia E.; Moll, Mark; Nedunuri, Srinivas; Prabhu, Sailesh; Wang, Yue
We present a new approach to integrated task and motion planning (ITMP) for robots performing mobile manipulation. In our approach, the user writes a high-level specification that captures partial knowledge about a mobile manipulation setting.
In particular, this specification includes a plan outline that syntactically defines a space of plausible integrated plans, a set of logical requirements that the generated plan must satisfy, and a description of the physical space that the robot manipulates. A synthesis algorithm is now used to search for an integrated plan that falls within the space defined by the plan outline, and also satisfies all requirements. Our synthesis algorithm complements continuous motion planning algorithms with calls to a Satisfiability Modulo Theories (SMT) solver. From the scene description, a motion planning algorithm is used to construct a placement graph, an abstraction of a manipulation graph whose paths represent feasible, low-level motion plans. An SMT-solver is now used to symbolically explore the space of all integrated plans that correspond to paths in the placement graph, and also satisfy the constraints demanded by the plan outline and the requirements. Our approach is implemented in a system called ROBOSYNTH. We have evaluated ROBOSYNTH on a generalization of an ITMP problem investigated in prior work. The experiments demonstrate that our method is capable of generating integrated plans for a number of interesting variations on the problem.

Item Using parallelized incremental meta-docking can solve the conformational sampling issue when docking large ligands to proteins (Biomed Central, 2019)
Devaurs, Didier; Antunes, Dinler A.; Hall-Swan, Sarah; Mitchell, Nicole; Moll, Mark; Lizée, Gregory; Kavraki, Lydia E.
Background: Docking large ligands, and especially peptides, to protein receptors is still considered a challenge in computational structural biology. Besides the issue of accurately scoring the binding modes of a protein-ligand complex produced by a molecular docking tool, the conformational sampling of a large ligand is also often considered a challenge because of its underlying combinatorial complexity.
In this study, we evaluate the impact of using parallelized and incremental paradigms on the accuracy and performance of conformational sampling when docking large ligands. We use five datasets of protein-ligand complexes involving ligands that could not be accurately docked by classical protein-ligand docking tools in previous similar studies. Results: Our computational evaluation shows that simply increasing the amount of conformational sampling performed by a protein-ligand docking tool, such as Vina, by running it for longer is rarely beneficial. Instead, it is more efficient and advantageous to run several short instances of this docking tool in parallel and group their results together, in a straightforward parallelized docking protocol. Even greater accuracy and efficiency are achieved by our parallelized incremental meta-docking tool, DINC, showing the additional benefits of its incremental paradigm. Using DINC, we could accurately reproduce the vast majority of the protein-ligand complexes we considered. Conclusions: Our study suggests that, even when trying to dock large ligands to proteins, the conformational sampling of the ligand should no longer be considered an issue, as simple docking protocols using existing tools can solve it. Therefore, scoring should currently be regarded as the biggest unmet challenge in molecular docking.
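
The "straightforward parallelized docking protocol" mentioned in the abstract above (several short, independent runs whose results are grouped together) can be illustrated with a toy sketch. This is not the DINC or Vina implementation: `toy_score`, `short_docking_run`, and `parallel_docking` are hypothetical names, the scoring function is an arbitrary stand-in, and random sampling replaces a real docking search.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_score(pose):
    """Hypothetical stand-in for a docking score (lower is better)."""
    return sum((x - 0.5) ** 2 for x in pose)


def short_docking_run(seed, n_samples=200, n_dims=6):
    """One short, independent sampling run; returns its best (score, pose)."""
    rng = random.Random(seed)  # a separate seed per run keeps runs independent
    best = None
    for _ in range(n_samples):
        pose = tuple(rng.random() for _ in range(n_dims))
        candidate = (toy_score(pose), pose)
        if best is None or candidate < best:
            best = candidate
    return best


def parallel_docking(seeds, top_k=3):
    """Launch one short run per seed, pool the results, and rank them by score."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(short_docking_run, seeds))
    return sorted(results)[:top_k]


if __name__ == "__main__":
    for score, pose in parallel_docking(seeds=range(8)):
        print(f"{score:.4f}", [round(x, 2) for x in pose])
```

Threads are used here only to keep the sketch self-contained; with an external docking program, each run would more plausibly be a separate process or cluster job, with the pooling and ranking step unchanged.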