Browsing by Author "Dun, Chen"
Now showing 1 - 3 of 3
Item
A General Method for Efficient Distributed Training and Federated Learning in Synchronous and Asynchronous Scenarios (2023-11-30)
Dun, Chen; Kyrillidis, Anastasios

Throughout the development of machine learning systems, there has been a persistent conflict between model performance, model scale, and computational resources. The drive to improve model performance keeps increasing model size, training-dataset size, and training time, while the available resources remain limited by device memory and compute power, and by restrictions on data usage (due to data storage or user privacy). Two main lines of research address this conflict. The first focuses on reducing the required computational resources: synchronous distributed training systems (such as data parallelism and model parallelism) and asynchronous distributed training systems have been widely studied, and federated learning systems have been developed to address the additional restrictions on data usage imposed by privacy or storage. The second line instead focuses on improving model performance at a fixed model scale, using Mixture-of-Experts (MoE) systems. Finding a shared essence underlying these two directions, we aim to create a general methodology that solves the problems arising in both. We propose a novel methodology that partitions a large neural network, randomly or by a controlled method, into smaller subnetworks, each of which is distributed to a local worker, trained independently, and synchronized periodically. For the first direction, we demonstrate, with theoretical guarantees and empirical experiments, that this methodology applies to both synchronous and asynchronous systems, to different model architectures, and to both distributed training and federated learning, in most cases significantly reducing communication, memory, and computation costs. For the second direction, we demonstrate that the methodology can significantly improve model performance in MoE systems without increasing model scale, by guiding the training of specialized experts, and that it applies to MoE systems built on both traditional deep learning models and recent Large Language Models (LLMs).
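The core loop this abstract describes (partition, independent local training, periodic synchronization) can be sketched in a few lines. The following is a minimal, hypothetical NumPy rendering under simplifying assumptions (a two-layer linear network, a squared loss, workers simulated sequentially); it is not the dissertation's implementation, and all names (`local_step`, `sync_round`, `n_workers`) are illustrative.

```python
# Sketch only: randomly partition a network's hidden units into disjoint
# subnetworks, train each "worker's" piece independently, merge back periodically.
import numpy as np

rng = np.random.default_rng(0)
d, h, k = 8, 32, 1                    # input dim, hidden width, output dim
W1 = rng.normal(size=(d, h)) * 0.1    # full model, layer 1
W2 = rng.normal(size=(h, k)) * 0.1    # full model, layer 2

def local_step(W1_sub, W2_sub, X, y, lr=1e-2):
    """One SGD step on a subnetwork (linear activations keep the sketch short)."""
    pred = X @ W1_sub @ W2_sub
    err = pred - y
    g2 = (X @ W1_sub).T @ err / len(X)       # grad w.r.t. the worker's W2 slice
    g1 = X.T @ (err @ W2_sub.T) / len(X)     # grad w.r.t. the worker's W1 slice
    return W1_sub - lr * g1, W2_sub - lr * g2

n_workers, local_iters = 4, 10
X = rng.normal(size=(64, d)); y = rng.normal(size=(64, k))

for sync_round in range(5):
    # Randomly repartition hidden units into disjoint groups, one per worker.
    parts = np.array_split(rng.permutation(h), n_workers)
    for idx in parts:                         # in practice these run in parallel
        W1_sub, W2_sub = W1[:, idx], W2[idx, :]
        for _ in range(local_iters):          # independent local training
            W1_sub, W2_sub = local_step(W1_sub, W2_sub, X, y)
        W1[:, idx], W2[idx, :] = W1_sub, W2_sub   # periodic synchronization
```

In this sketch, because each worker owns a disjoint slice of the weights, synchronization is a simple write-back of that slice rather than an exchange of the full model, which is where the communication and memory savings in this setting come from.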
Item
CrysFormer: Protein structure determination via Patterson maps, deep learning, and partial structure attention (AIP Publishing LLC, 2024)
Pan, Tom; Dun, Chen; Jin, Shikai; Miller, Mitchell D.; Kyrillidis, Anastasios; Phillips, George N., Jr.

Determining the atomic-level structure of a protein has been a decades-long challenge. However, recent advances in transformers and related neural network architectures have enabled researchers to significantly improve solutions to this problem. These methods use large datasets of sequence information and corresponding known protein template structures, if available. Yet, such methods focus only on sequence information. Other available prior knowledge could also be utilized, such as constructs derived from x-ray crystallography experiments and the known structures of the most common conformations of amino acid residues, which we refer to as partial structures. To the best of our knowledge, we propose the first transformer-based model that directly utilizes experimental protein crystallographic data and partial structure information to calculate electron density maps of proteins. In particular, we use Patterson maps, which can be obtained directly from x-ray crystallography experimental data, thus bypassing the well-known crystallographic phase problem (a short illustrative sketch follows this listing). We demonstrate that our method, CrysFormer, achieves precise predictions on two synthetic datasets of peptide fragments in crystalline forms, one with two residues per unit cell and the other with fifteen. These predictions can then be used to generate accurate atomic models using established crystallographic refinement programs.

Item
Current progress and open challenges for applying deep learning across the biosciences (Springer Nature, 2022)
Sapoval, Nicolae; Aghazadeh, Amirali; Nute, Michael G.; Antunes, Dinler A.; Balaji, Advait; Baraniuk, Richard; Barberan, C.J.; Dannenfelser, Ruth; Dun, Chen; Edrisi, Mohammadamin; Elworth, R.A. Leo; Kille, Bryce; Kyrillidis, Anastasios; Nakhleh, Luay; Wolfe, Cameron R.; Yan, Zhi; Yao, Vicky; Treangen, Todd J.

Deep Learning (DL) has recently enabled unprecedented advances in one of the grand challenges in computational biology: the half-century-old problem of protein structure prediction. In this paper we discuss recent advances, limitations, and future perspectives of DL in five broad areas: protein structure prediction, protein function prediction, genome engineering, systems biology and data integration, and phylogenetic inference. We discuss each application area and cover the main bottlenecks of DL approaches, such as training data, problem scope, and the ability to leverage existing DL architectures in new contexts. To conclude, we provide a summary of the subject-specific and general challenges for DL across the biosciences.
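Returning to the CrysFormer entry above: the reason Patterson maps bypass the phase problem is that they are the Fourier transform of the measured diffraction intensities |F|^2 alone, which require no phase information. Below is a minimal, hypothetical NumPy sketch of that relationship; the grid and density values are made up, and this is not the paper's pipeline.

```python
# Sketch only: a Patterson map is the inverse Fourier transform of the
# intensities |F|^2, so it can be computed without solving for phases.
import numpy as np

rng = np.random.default_rng(1)
density = rng.random((16, 16, 16))          # stand-in electron density on a grid

F = np.fft.fftn(density)                    # structure factors (complex: amplitude + phase)
intensities = np.abs(F) ** 2                # what an x-ray experiment actually measures

patterson = np.fft.ifftn(intensities).real  # Patterson map; real by Friedel symmetry
```

Here `patterson` is real-valued because Friedel symmetry makes the intensity grid centrosymmetric; CrysFormer takes maps of this kind as input and learns to produce the electron density that conventional methods obtain only after phasing.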