Browsing by Author "Litsa, Eleni E."
Item: Elucidating Metabolism through Machine Learning (2021-04-23)
Litsa, Eleni E.; Kavraki, Lydia E.
Metabolism consists of all chemical reactions that take place in an organism to sustain life. Metabolic studies have the potential to advance chemical synthesis and drug development, the discovery of biomarkers and therapeutic targets, and environmental management. Computational tools can greatly benefit metabolic studies because standard experimental practices are often laborious and resource-demanding. Existing computational approaches often rely on expert knowledge, which limits scalability and generalizability. As the volume of available metabolic data grows, Machine Learning (ML) is emerging as a promising tool to assist metabolic studies. Recent advances in ML and Deep Learning (DL), especially for structured data such as chemical molecules, point in the same direction. Metabolic data, however, are scarce compared with general chemical data, which makes applying ML especially challenging. In this work, we explored statistical ML methodologies as well as DL architectures. In the latter case, we explored Transfer Learning to circumvent the limited-data problem and to take advantage of the massive datasets of general chemical data. More specifically, we developed ML-based approaches for three problems to assist metabolic studies. The first problem is to automatically identify the reaction mechanism of a metabolic reaction in the form of an atom mapping between the atoms on the two sides of the reaction. We approach this as a graph matching problem, representing chemical molecules as graphs, and solve it with optimization algorithms. Our approach improved upon existing methodologies by incorporating chemical knowledge into the graph problem using statistical ML. The second problem is to predict human metabolites of chemical molecules such as drugs. 
We approached this problem as a sequence translation problem, representing chemical molecules as sequences in a standard notation called SMILES. We used a neural Machine Translation algorithm to translate the sequence of the molecule into the metabolites that may be formed in the human body. Our end-to-end learning approach exhibits better scalability and generalizability than previous rule-based methodologies. Finally, the third problem is to recommend chemical structures given mass spectrometry data in order to assist structure elucidation in metabolomics studies. We approached this problem as a signal translation problem, where the signal recorded by the mass spectrometer is translated into the SMILES sequence of the chemical molecule using a DL architecture. Our approach is the first with the potential to aid the elucidation of even novel molecules whose structures are not yet known. Overall, our work has demonstrated the potential of ML and DL to assist metabolic studies, as well as the importance of Transfer Learning in domains with limited available data.

Item: Machine Learning-Guided Three-Dimensional Printing of Tissue Engineering Scaffolds (Mary Ann Liebert, Inc., 2020)
Conev, Anja; Litsa, Eleni E.; Perez, Marissa R.; Diba, Mani; Mikos, Antonios G.; Kavraki, Lydia E.; Bioengineering; Computer Science; Center for Engineering Complex Tissues
Various material compositions have been successfully used in 3D printing with promising applications as scaffolds in tissue engineering. However, identifying suitable printing conditions for new materials requires extensive experimentation in a time- and resource-demanding process. 
This study investigates the use of Machine Learning (ML) for distinguishing between printing configurations that are likely to result in low-quality prints and printing configurations that are more promising, as a first step toward the development of a recommendation system for identifying suitable printing conditions. The ML-based framework takes as input the printing conditions, namely the material composition and the printing parameters, and predicts the quality of the resulting print as either “low” or “high.” We investigate two ML-based approaches: a direct, classification-based approach that trains a classifier to distinguish between low- and high-quality prints, and an indirect approach that uses a regression ML model to approximate the values of a printing quality metric. Both approaches are built upon Random Forests. We trained and evaluated the models on a dataset generated in a previous study, which investigated the fabrication of porous polymer scaffolds by means of extrusion-based 3D printing with a full-factorial design. Our results show that both models were able to correctly label the majority of the tested configurations, while a simpler linear ML model was not effective. Additionally, our analysis showed that a full-factorial design for data collection can lead to redundancies in the data, in the context of ML, and we propose a more efficient data collection strategy.

Item: Prediction of drug metabolites using neural machine translation (Royal Society of Chemistry, 2020)
Litsa, Eleni E.; Das, Payel; Kavraki, Lydia E.
Metabolic processes in the human body can alter the structure of a drug, affecting its efficacy and safety. As a result, the investigation of the metabolic fate of a candidate drug is an essential part of drug design studies. Computational approaches have been developed for the prediction of possible drug metabolites in an effort to assist the traditional and resource-demanding experimental route. 
Current methodologies are based upon metabolic transformation rules, which are tied to specific enzyme families and therefore lack generalization; they may also involve manual work from experts, which limits scalability. We present a rule-free, end-to-end learning-based method for predicting possible human metabolites of small molecules, including drugs. The metabolite prediction task is approached as a sequence translation problem, with chemical compounds represented using the SMILES notation. We perform transfer learning on a deep learning transformer model for sequence translation, originally trained on chemical reaction data, to predict the outcome of human metabolic reactions. We further build an ensemble model to account for multiple and diverse metabolites. Extensive evaluation reveals that the proposed method generalizes well to different enzyme families, as it can correctly predict metabolites formed through phase I and phase II drug metabolism as well as through other enzymes. Compared to existing rule-based approaches, our method has equivalent performance on the major enzyme families, while it additionally finds metabolites formed through less common enzymes. Our results indicate that the proposed approach can provide a comprehensive study of drug metabolism that is not restricted to the major enzyme families and does not require the extraction of transformation rules.
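
To illustrate what treating a compound as a sequence can look like, the following is a minimal sketch of splitting a SMILES string into tokens before it is passed to a translation model. It is not the authors' tokenizer: the regular expression, the function names `tokenize_smiles` and `to_model_input`, and the space-separated output format are assumptions made for illustration only, and the pattern covers only common cases (bracket atoms, the two-letter halogens Cl and Br, two-digit ring closures, and single characters).

```python
import re
from typing import List

# Hypothetical token pattern (an assumption, not taken from the paper):
# bracket atoms such as [C@H] or [NH4+], then Cl and Br, then two-digit
# ring closures such as %12, then any remaining single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def to_model_input(smiles: str) -> str:
    """Join tokens with spaces, a common input format for translation models."""
    return " ".join(tokenize_smiles(smiles))


if __name__ == "__main__":
    # Aspirin, written as SMILES.
    print(to_model_input("CC(=O)Oc1ccccc1C(=O)O"))
```

Keeping bracket atoms and two-letter halogens as single tokens prevents the model from seeing, for example, `Cl` as a carbon followed by a stray `l`.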