Browsing by Author "Peterson, Christine"
Item Bayesian graphical models for biological network inference (2013-11-20)
Peterson, Christine; Vannucci, Marina; Ensor, Katherine B.; Kavraki, Lydia E.; Maletic-Savatic, Mirjana; Stingo, Francesco C.
In this work, we propose approaches for the inference of graphical models in the Bayesian framework. Graphical models, which use a network structure to represent conditional dependencies among random variables, provide a valuable tool for visualizing and understanding the relationships among many variables. However, since these networks are complex systems, they can be difficult to infer given a limited number of observations. Our research is focused on development of methods which allow incorporation of prior information on particular edges or on the model structure to improve the reliability of inference given small to moderate sample sizes. First, we propose an approach to graphical model inference using the Bayesian graphical lasso. Our method incorporates informative priors on the shrinkage parameters specific to each edge. We demonstrate through simulations that this method allows improved learning of the network structure when relevant prior information is available, and illustrate the approach on inference of the cellular metabolic network under neuroinflammation. This application highlights the strength of our method since the number of samples available is fairly small, but we are able to draw on rich reference information from publicly available databases describing known metabolic interactions to construct informative priors. Next, we propose a modeling approach for settings where we would like to estimate networks for a collection of possibly related sample groups, where the sample size for each subgroup may be limited. We use a Markov random field prior to link the graphs within each group, and a selection prior to infer which groups have shared network structure. This allows us to encourage common edges across sample groups, when supported by the data.
We provide simulation studies to illustrate the properties of our method and compare its performance to competing approaches. We conclude by demonstrating use of the proposed method to infer protein networks for various subtypes of acute myeloid leukemia and to infer signaling networks under different experimental perturbations.

Item Bayesian Graphical Network Analyses Reveal Complex Biological Interactions Specific to Alzheimer's Disease (IOS Press, 2015)
Rembach, Alan; Stingo, Francesco C.; Peterson, Christine; Vannucci, Marina; Do, Kim-Anh; Wilson, William J.; Macaulay, S. Lance; Ryan, Timothy M.; Martins, Ralph N.; Ames, David; Masters, Colin L.; Doecke, James D.; The AIBL Research Group
With different approaches to finding prognostic or diagnostic biomarkers for Alzheimer's disease (AD), many studies pursue only brief lists of biomarkers or disease specific pathways, potentially dismissing information from groups of correlated biomarkers. Using a novel Bayesian graphical network method, with data from the Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging, the aim of this study was to assess the biological connectivity between AD associated blood-based proteins. Briefly, three groups of protein markers (18, 37, and 48 proteins, respectively) were assessed for the posterior probability of biological connection both within and between clinical classifications. Clinical classification was defined in four groups: high performance healthy controls (hpHC), healthy controls (HC), participants with mild cognitive impairment (MCI), and participants with AD. Using the smaller group of proteins, posterior probabilities of network similarity between clinical classifications were very high, indicating no difference in biological connections between groups.
Increasing the number of proteins increased the capacity to separate both hpHC and HC apart from the AD group (0 for complete separation, 1 for complete similarity), with posterior probabilities shifting from 0.89 for the 18 protein group, through to 0.54 for the 37 protein group, and finally 0.28 for the 48 protein group. Using this approach, we identified beta-2 microglobulin (β2M) as a potential master regulator of multiple proteins across all classifications, demonstrating that this approach can be used across many data sets to identify novel insights into diseases like AD.

Item Hierarchical Normalized Completely Random Measures for Robust Graphical Modeling (Project Euclid, 2019)
Cremaschi, Andrea; Argiento, Raffaele; Shoemaker, Katherine; Peterson, Christine; Vannucci, Marina
Gaussian graphical models are useful tools for exploring network structures in multivariate normal data. In this paper we are interested in situations where data show departures from Gaussianity, therefore requiring alternative modeling distributions. The multivariate t-distribution, obtained by dividing each component of the data vector by a gamma random variable, is a straightforward generalization to accommodate deviations from normality such as heavy tails. Since different groups of variables may be contaminated to a different extent, Finegold and Drton (2014) introduced the Dirichlet t-distribution, where the divisors are clustered using a Dirichlet process. In this work, we consider a more general class of nonparametric distributions as the prior on the divisor terms, namely the class of normalized completely random measures (NormCRMs). To improve the effectiveness of the clustering, we propose modeling the dependence among the divisors through a nonparametric hierarchical structure, which allows for the sharing of parameters across the samples in the data set. This desirable feature enables us to cluster together different components of multivariate data in a parsimonious way.
We demonstrate through simulations that this approach provides accurate graphical model inference, and apply it to a case study examining the dependence structure in radiomics data derived from The Cancer Imaging Atlas.

Item Inferring metabolic networks using the Bayesian adaptive graphical lasso with informative priors (International Press, 2013)
Peterson, Christine; Vannucci, Marina; Karakas, Cemal; Choi, William; Ma, Lihua; Maletic-Savatic, Mirjana
Metabolic processes are essential for cellular function and survival. We are interested in inferring a metabolic network in activated microglia, a major neuroimmune cell in the brain responsible for the neuroinflammation associated with neurological diseases, based on a set of quantified metabolites. To achieve this, we apply the Bayesian adaptive graphical lasso with informative priors that incorporate known relationships between covariates. To encourage sparsity, the Bayesian graphical lasso places double exponential priors on the off-diagonal entries of the precision matrix. The Bayesian adaptive graphical lasso allows each double exponential prior to have a unique shrinkage parameter. These shrinkage parameters share a common gamma hyperprior. We extend this model to create an informative prior structure by formulating tailored hyperpriors on the shrinkage parameters. By choosing parameter values for each hyperprior that shift probability mass toward zero for nodes that are close together in a reference network, we encourage edges between covariates with known relationships. This approach can improve the reliability of network inference when the sample size is small relative to the number of parameters to be estimated.
When applied to the data on activated microglia, the inferred network includes both known relationships and associations of potential interest for further investigation.

Item Regularized partial least squares with an application to NMR spectroscopy (John Wiley & Sons, Inc., 2013)
Allen, Genevera I.; Peterson, Christine; Vannucci, Marina; Maletic-Savatic, Mirjana
High-dimensional data common in genomics, proteomics, and chemometrics often contains complicated correlation structures. Recently, partial least squares (PLS) and Sparse PLS methods have gained attention in these areas as dimension reduction techniques in the context of supervised data analysis. We introduce a framework for Regularized PLS by solving a relaxation of the SIMPLS optimization problem with penalties on the PLS loadings vectors. Our approach enjoys many advantages including flexibility, general penalties, easy interpretation of results, and fast computation in high-dimensional settings. We also outline extensions of our methods leading to novel methods for non-negative PLS and generalized PLS, an adoption of PLS for structured data. We demonstrate the utility of our methods through simulations and a case study on proton Nuclear Magnetic Resonance (NMR) spectroscopy data.
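
As a rough illustration of what a penalty on a PLS loading vector can look like, the sketch below computes a single sparse loading for a univariate response by soft-thresholding the covariance vector X^T y and rescaling it to unit length. This is a simplified special case chosen for illustration (one component, an L1-type penalty, a single response), not the authors' implementation; the function names and the penalty parameter `lam` are hypothetical.

```python
import numpy as np


def soft_threshold(z, lam):
    """Elementwise soft-thresholding: shrink each entry toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_pls_first_loading(X, y, lam):
    """First PLS loading with an L1-style penalty (illustrative sketch only).

    X   : (n, p) predictor matrix
    y   : (n,) univariate response
    lam : non-negative penalty level (hypothetical tuning parameter)
    """
    Xc = X - X.mean(axis=0)       # center each predictor column
    yc = y - y.mean()             # center the response
    m = Xc.T @ yc                 # covariance direction; unpenalized loading is m / ||m||
    v = soft_threshold(m, lam)    # the penalty sets small entries exactly to zero
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v                  # penalty large enough to remove every variable
    return v / norm               # rescale to unit length


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 6))
    y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=40)
    print(sparse_pls_first_loading(X, y, lam=10.0))
```

With `lam = 0` the function returns the ordinary unit-norm direction of X^T y; larger values of `lam` zero out more entries of the loading, which is what makes the resulting component easier to interpret in high-dimensional settings.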