Browsing by Author "Vannucci, Marina"
Item A Bayesian approach for capturing daily heterogeneity in intra-daily durations time series (De Gruyter, 2013) Brownlees, Christian T.; Vannucci, Marina
Intra-daily financial durations time series typically exhibit evidence of long range dependence. This has motivated the introduction of models able to reproduce this stylized fact, like the Fractionally Integrated Autoregressive Conditional Duration Model. In this work we introduce a novel specification able to capture long range dependence. We propose a three component model that consists of an autoregressive daily random effect, a semiparametric time-of-day effect and an intra-daily dynamic component: the Mixed Autoregressive Conditional Duration (Mixed ACD) Model. The random effect component allows for heterogeneity in mean reversal within a day and captures low frequency dynamics in the duration time series. The joint estimation of the model parameters is carried out using MCMC techniques based on the Bayesian formulation of the model. The empirical application to a set of widely traded US tickers shows that the model is able to capture low frequency dependence in duration time series. We also find that the degree of dependence and dispersion of low frequency dynamics is higher in periods of higher financial distress.

Item A Bayesian Integrative Model for Genetical Genomics with Spatially Informed Variable Selection (Libertas Academica, 2014) Cassese, Alberto; Guindani, Michele; Vannucci, Marina
We consider a Bayesian hierarchical model for the integration of gene expression levels with comparative genomic hybridization (CGH) array measurements collected on the same subjects. The approach defines a measurement error model that relates the gene expression levels to latent copy number states. In turn, the latent states are related to the observed surrogate CGH measurements via a hidden Markov model.
The model further incorporates variable selection with a spatial prior based on a probit link that exploits dependencies across adjacent DNA segments. Posterior inference is carried out via Markov chain Monte Carlo stochastic search techniques. We study the performance of the model in simulations and show better results than those achieved with recently proposed alternative priors. We also show an application to data from a genomic study on lung squamous cell carcinoma, where we identify potential candidates of associations between copy number variants and the transcriptional activity of target genes. Gene ontology (GO) analyses of our findings reveal enrichments in genes that code for proteins involved in cancer. Our model also identifies a number of potential candidate biomarkers for further experimental validation.

Item A Bayesian model for the identification of differentially expressed genes in Daphnia magna exposed to munition pollutants (Wiley, 2015) Cassese, Alberto; Guindani, Michele; Antczak, Philipp; Falciani, Francesco; Vannucci, Marina
In this article we propose a Bayesian hierarchical model for the identification of differentially expressed genes in Daphnia magna organisms exposed to chemical compounds, specifically munition pollutants in water. The model we propose constitutes one of the very first attempts at a rigorous modeling of the biological effects of water purification. We have data acquired from a purification system that comprises four consecutive purification stages, which we refer to as "ponds," of progressively more contaminated water. We model the expected expression of a gene in a pond as the sum of the mean of the same gene in the previous pond plus a gene-pond specific difference.
We incorporate a variable selection mechanism for the identification of the differential expressions, with a prior distribution on the probability of a change that accounts for the available information on the concentration of chemical compounds present in the water. We carry out posterior inference via MCMC stochastic search techniques. In the application, we reduce the complexity of the data by grouping genes according to their functional characteristics, based on the KEGG pathway database. This also increases the biological interpretability of the results. Our model successfully identifies a number of pathways that show differential expression between consecutive purification stages. We also find that changes in the transcriptional response are more strongly associated to the presence of certain compounds, with the remaining contributing to a lesser extent. We discuss the sensitivity of these results to the model parameters that measure the influence of the prior information on the posterior inference.

Item A Bayesian Nonparametric Approach for Functional Data Classification with Application to Hepatic Tissue Characterization (Libertas Academica, 2015) Fronczyk, Kassandra M.; Guindani, Michele; Hobbs, Brian P.; Ng, Chaan S.; Vannucci, Marina
Computed tomography perfusion (CTp) is an emerging functional imaging technology that provides a quantitative assessment of the passage of fluid through blood vessels. Tissue perfusion plays a critical role in oncology due to the proliferation of networks of new blood vessels typical of cancer angiogenesis, which triggers modifications to the vasculature of the surrounding host tissue. In this article, we consider a Bayesian semiparametric model for the analysis of functional data. This method is applied to a study of four interdependent hepatic perfusion CT characteristics that were acquired under the administration of contrast using a sequence of repeated scans over a period of 590 seconds.
More specifically, our modeling framework facilitates borrowing of information across patients and tissues. Additionally, the approach enables flexible estimation of temporal correlation structures exhibited by mappings of the correlated perfusion biomarkers and thus accounts for the heteroskedasticity typically observed in those measurements, by incorporating change-points in the covariance estimation. This method is applied to measurements obtained from regions of liver surrounding malignant and benign tissues, for each perfusion biomarker. We demonstrate how to cluster the liver regions on the basis of their CTp profiles, which can be used in a prediction context to classify regions of interest provided by future patients, and thereby assist in discriminating malignant from healthy tissue regions in diagnostic settings.

Item A Bayesian Nonparametric Spiked Process Prior for Dynamic Model Selection (Project Euclid, 2019) Cassese, Alberto; Zhu, Weixuan; Guindani, Michele; Vannucci, Marina
In many applications, investigators monitor processes that vary in space and time, with the goal of identifying temporally persistent and spatially localized departures from a baseline or “normal” behavior. In this manuscript, we consider the monitoring of pneumonia and influenza (P&I) mortality, to detect influenza outbreaks in the continental United States, and propose a Bayesian nonparametric model selection approach to take into account the spatio-temporal dependence of outbreaks. More specifically, we introduce a zero-inflated conditionally identically distributed species sampling prior which allows borrowing information across time and to assign data to clusters associated to either a null or an alternate process. Spatial dependences are accounted for by means of a Markov random field prior, which allows to inform the selection based on inferences conducted at nearby locations.
We show how the proposed modeling framework performs in an application to the P&I mortality data and in a simulation study, and compare with common threshold methods for detecting outbreaks over time, with more recent Markov switching based models, and with spike-and-slab Bayesian nonparametric priors that do not take into account spatio-temporal dependence.

Item A Bayesian switching linear dynamical system for estimating seizure chronotypes (National Academy of Sciences, 2022) Wang, Emily T.; Vannucci, Marina; Haneef, Zulfi; Moss, Robert; Rao, Vikram R.; Chiang, Sharon
Epilepsy is a disorder characterized by paroxysmal transitions between multistable states. Dynamical systems have been useful for modeling the paroxysmal nature of seizures. At the same time, intracranial electroencephalography (EEG) recordings have recently discovered that an electrographic measure of epileptogenicity, interictal epileptiform activity, exhibits cycling patterns ranging from ultradian to multidien rhythmicity, with seizures phase-locked to specific phases of these latent cycles. However, many mechanistic questions about seizure cycles remain unanswered. Here, we provide a principled approach to recast the modeling of seizure chronotypes within a statistical dynamical systems framework by developing a Bayesian switching linear dynamical system (SLDS) with variable selection to estimate latent seizure cycles. We propose a Markov chain Monte Carlo algorithm that employs particle Gibbs with ancestral sampling to estimate latent cycles in epilepsy and apply unsupervised learning on spectral features of latent cycles to uncover clusters in cycling tendency. We analyze the largest database of patient-reported seizures in the world to comprehensively characterize multidien cycling patterns among 1,012 people with epilepsy, spanning from infancy to older adulthood.
Our work advances knowledge of cycling in epilepsy by investigating how multidien seizure cycles vary in people with epilepsy, while demonstrating an application of an SLDS to frame seizure cycling within a nonlinear dynamical systems framework. It also lays the groundwork for future studies to pursue data-driven hypothesis generation regarding the mechanistic drivers of seizure cycles.

Item A Hierarchical Bayesian Model for the Identification of PET Markers Associated to the Prediction of Surgical Outcome after Anterior Temporal Lobe Resection (Frontiers Media S.A., 2017) Chiang, Sharon; Guindani, Michele; Yeh, Hsiang J.; Dewar, Sandra; Haneef, Zulfi; Stern, John M.; Vannucci, Marina
We develop an integrative Bayesian predictive modeling framework that identifies individual pathological brain states based on the selection of fluoro-deoxyglucose positron emission tomography (PET) imaging biomarkers and evaluates the association of those states with a clinical outcome. We consider data from a study on temporal lobe epilepsy (TLE) patients who subsequently underwent anterior temporal lobe resection. Our modeling framework looks at the observed profiles of regional glucose metabolism in PET as the phenotypic manifestation of a latent individual pathologic state, which is assumed to vary across the population. The modeling strategy we adopt allows the identification of patient subgroups characterized by latent pathologies differentially associated to the clinical outcome of interest. It also identifies imaging biomarkers characterizing the pathological states of the subjects. In the data application, we identify a subgroup of TLE patients at high risk for post-surgical seizure recurrence after anterior temporal lobe resection, together with a set of discriminatory brain regions that can be used to distinguish the latent subgroups.
We show that the proposed method achieves high cross-validated accuracy in predicting post-surgical seizure recurrence.

Item A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells (Public Library of Science, 2016) Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antczak, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J.; Guindani, Michele; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco
The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue.
The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication networks in a wide spectrum of biological systems.

Item A predictor-informed multi-subject bayesian approach for dynamic functional connectivity (Public Library of Science, 2024) Lee, Jaylen; Hussain, Sana; Warnick, Ryan; Vannucci, Marina; Menchaca, Isaac; Seitz, Aaron R.; Hu, Xiaoping; Peters, Megan A. K.; Guindani, Michele
Dynamic functional connectivity investigates how the interactions among brain regions vary over the course of an fMRI experiment. Such transitions between different individual connectivity states can be modulated by changes in underlying physiological mechanisms that drive functional network dynamics, e.g., changes in attention or cognitive effort. In this paper, we develop a multi-subject Bayesian framework where the estimation of dynamic functional networks is informed by time-varying exogenous physiological covariates that are simultaneously recorded in each subject during the fMRI experiment. More specifically, we consider a dynamic Gaussian graphical model approach where a non-homogeneous hidden Markov model is employed to classify the fMRI time series into latent neurological states.
We assume the state-transition probabilities to vary over time and across subjects as a function of the underlying covariates, allowing for the estimation of recurrent connectivity patterns and the sharing of networks among the subjects. We further assume sparsity in the network structures via shrinkage priors, and achieve edge selection in the estimated graph structures by introducing a multi-comparison procedure for shrinkage-based inferences with Bayesian false discovery rate control. We evaluate the performances of our method vs alternative approaches on synthetic data. We apply our modeling framework on a resting-state experiment where fMRI data have been collected concurrently with pupillometry measurements, as a proxy of cognitive processing, and assess the heterogeneity of the effects of changes in pupil dilation on the subjects’ propensity to change connectivity states. The heterogeneity of state occupancy across subjects provides an understanding of the relationship between increased pupil dilation and transitions toward different cognitive states.

Item A systems biology approach reveals common metastatic pathways in osteosarcoma (BioMed Central, 2012) Flores, Ricardo J.; Li, Yiting; Yu, Alexander; Shen, Jianhe; Rao, Pulivarthi H.; Lau, Serrine S.; Vannucci, Marina; Lau, Ching C.; Man, Tsz-Kwong
Background: Osteosarcoma (OS) is the most common malignant bone tumor in children and adolescents. The survival rate of patients with metastatic disease remains very dismal. Nevertheless, metastasis is a complex process and a single-level analysis is not likely to identify its key biological determinants. In this study, we used a systems biology approach to identify common metastatic pathways that are jointly supported by both mRNA and protein expression data in two distinct human metastatic OS models.
Results: mRNA expression microarray and N-linked glycoproteomic analyses were performed on two commonly used isogenic pairs of human metastatic OS cell lines, namely HOS/143B and SaOS-2/LM7. Pathway analysis of the differentially regulated genes and glycoproteins separately revealed pathways associated to metastasis including cell cycle regulation, immune response, and epithelial-to-mesenchymal-transition. However, no common significant pathway was found at both genomic and proteomic levels between the two metastatic models, suggesting a very different biological nature of the cell lines. To address this issue, we used a topological significance analysis based on a “shortest-path” algorithm to identify topological nodes, which uncovered additional biological information with respect to the genomic and glycoproteomic profiles but remained hidden from the direct analyses. Pathway analysis of the significant topological nodes revealed a striking concordance between the models and identified significant common pathways, including “Cytoskeleton remodeling/TGF/WNT”, “Cytoskeleton remodeling/Cytoskeleton remodeling”, and “Cell adhesion/Chemokines and adhesion”. Of these, the “Cytoskeleton remodeling/TGF/WNT” was the top ranked common pathway from the topological analysis of the genomic and proteomic profiles in the two metastatic models. The up-regulation of proteins in the “Cytoskeleton remodeling/TGF/WNT” pathway in the SaOS-2/LM7 and HOS/143B models was further validated using an orthogonal Reverse Phase Protein Array platform. Conclusions: In this study, we used a systems biology approach by integrating genomic and proteomic data to identify key and common metastatic mechanisms in OS. The use of the topological analysis revealed hidden biological pathways that are known to play critical roles in metastasis. Wnt signaling has been previously implicated in OS and other tumors, and inhibitors of Wnt signaling pathways are available for clinical testing. 
Further characterization of this common pathway and other topological pathways identified from this study may lead to a novel therapeutic strategy for the treatment of metastatic OS.

Item Advanced Bayesian Models for Dependent Data (2023-04-18) Zeng, Zijian; Li, Meng; Vannucci, Marina
Over the past few years, there has been a noticeable increase in the amount of available data with complex dependent structure. Bayesian statistics is an approach to inference based on Bayes’ theorem, which is interpretable and provides uncertainty quantification. These advantages have made Bayesian methods widely used across various applied fields, including social sciences, ecology, genetics, medicine and more. In this thesis, we advance the application of Bayesian methods for three different types of dependent data. For the first project, we develop a Bayesian median autoregressive model for time series forecasting. This model utilizes time-varying quantile regression at the median, which inherits the robustness of median regression in contrast to the widely used mean-based methods. We use Bayesian model averaging to account for model uncertainty including the uncertainty in the autoregressive order, in addition to a Bayesian model selection approach. The second project addresses image-on-scalar regression. We consider a Bayesian hierarchical Gaussian process model for image smoothing, that uses a flexible Inverse-Wishart process prior to handle within-image dependency. We propose a general global-local spatial selection prior that achieves simultaneous global (i.e., at the covariate-level) and local (i.e., at the pixel/voxel-level) selection. We introduce participation rate parameters that measure the probability for individual covariates to affect the observed images.
This along with a hard-thresholding strategy leads to dependency between selections at the two levels, introduces extra sparsity at the local level, and allows the global selection to be informed by the local selection, all in a model-based manner. The last project is on Gaussian graphical regression models with covariates. We use a tensor representation of the regression coefficients to describe the multi-level selection action achieved by the proposed prior: covariate-level, edge-level and local-level. Simultaneous multi-level selection is done by nesting a global-local spike-and-slab prior in a sparse group selection prior. This nested prior first achieves a global-level selection, excluding a covariate, by measuring the probability of the covariate being influential, and then, conditional on the outcome, performs edge-level selection in the manner of conventional Gaussian graphical regression models. In a fully Bayes approach, we design Markov Chain Monte Carlo (MCMC) samplers for all three models and show in simulations and real data applications (with U.S. macroeconomic data, Autism brain imaging data, and human gene expression data respectively) that the proposed Bayesian methods are competitive with respect to existing models. Furthermore, the proposed Bayesian methods are also highly interpretable and able to provide joint uncertainty quantification via posterior samples for prediction and/or inference.

Item Advances in Bayesian Approaches for Directed and Undirected Graphical Models (2021-08-05) Osborne, Nathan; Vannucci, Marina; Peterson, Christine B.
In recent years, there has been a growing interest in the use of graphical models to understand the dependence relationships among random variables. Graphical models can capture both directed and undirected relationships, which may arise from a variety of underlying distributions. This flexibility has made them applicable across many fields of study.
In this thesis we advance the research on Bayesian methods for graphical model inference by providing novel approaches for both directed and undirected networks. We first introduce a simultaneous estimation approach for multiple Gaussian graphs that links the precision matrix entries across groups. This approach enables more accurate estimation and is accompanied by an efficient Gibbs sampling scheme. Next, we outline a Variational-EM algorithm for a Bayesian hierarchical model to estimate the latent network of compositional count data, while also selecting relevant external covariates. This model proves useful as we improve estimation of underlying networks while gaining insights into the effects of the covariates. Finally, we develop a multi-subject vector autoregression model with group level graph estimation and allow the cross-subject variance to be a function of covariates. We use variational inference for estimation and find that accounting for the cross-subject variance leads to more accurate group level edge selection. We illustrate these methods with applications to brain imaging and microbiome data.

Item An integrative Bayesian Dirichlet-multinomial regression model for the analysis of taxonomic abundances in microbiome data (BioMed Central, 2017) Wadsworth, W. Duncan; Argiento, Raffaele; Guindani, Michele; Galloway-Pena, Jessica; Shelbourne, Samuel A.; Vannucci, Marina
Background: The Human Microbiome has been variously associated with the immune-regulatory mechanisms involved in the prevention or development of many non-infectious human diseases such as autoimmunity, allergy and cancer. Integrative approaches which aim at associating the composition of the human microbiome with other available information, such as clinical covariates and environmental predictors, are paramount to develop a more complete understanding of the role of microbiome in disease development.
Results: In this manuscript, we propose a Bayesian Dirichlet-Multinomial regression model which uses spike-and-slab priors for the selection of significant associations between a set of available covariates and taxa from a microbiome abundance table. The approach allows straightforward incorporation of the covariates through a log-linear regression parametrization of the parameters of the Dirichlet-Multinomial likelihood. Inference is conducted through a Markov Chain Monte Carlo algorithm, and selection of the significant covariates is based upon the assessment of posterior probabilities of inclusions and the thresholding of the Bayesian false discovery rate. We design a simulation study to evaluate the performance of the proposed method, and then apply our model on a publicly available dataset obtained from the Human Microbiome Project which associates taxa abundances with KEGG orthology pathways. The method is implemented in specifically developed R code, which has been made publicly available. Conclusions: Our method compares favorably in simulations to several recently proposed approaches for similarly structured data, in terms of increased accuracy and reduced false positive as well as false negative rates. In the application to the data from the Human Microbiome Project, a close evaluation of the biological significance of our findings confirms existing associations in the literature.

Item Bayesian feature selection for radiomics using reliability metrics (Frontiers Media S.A., 2023) Shoemaker, Katherine; Ger, Rachel; Court, Laurence E.; Aerts, Hugo; Vannucci, Marina; Peterson, Christine B.
Introduction: Imaging of tumors is a standard step in diagnosing cancer and making subsequent treatment decisions. The field of radiomics aims to develop imaging based biomarkers using methods rooted in artificial intelligence applied to medical imaging.
However, a challenging aspect of developing predictive models for clinical use is that many quantitative features derived from image data exhibit instability or lack of reproducibility across different imaging systems or image-processing pipelines. Methods: To address this challenge, we propose a Bayesian sparse modeling approach for image classification based on radiomic features, where the inclusion of more reliable features is favored via a probit prior formulation. Results: We verify through simulation studies that this approach can improve feature selection and prediction given correct prior information. Finally, we illustrate the method with an application to the classification of head and neck cancer patients by human papillomavirus status, using as our prior information a reliability metric quantifying feature stability across different imaging systems.

Item Bayesian graphical models for biological network inference (2013-11-20) Peterson, Christine; Vannucci, Marina; Ensor, Katherine B.; Kavraki, Lydia E.; Maletic-Savatic, Mirjana; Stingo, Francesco C.
In this work, we propose approaches for the inference of graphical models in the Bayesian framework. Graphical models, which use a network structure to represent conditional dependencies among random variables, provide a valuable tool for visualizing and understanding the relationships among many variables. However, since these networks are complex systems, they can be difficult to infer given a limited number of observations. Our research is focused on development of methods which allow incorporation of prior information on particular edges or on the model structure to improve the reliability of inference given small to moderate sample sizes. First, we propose an approach to graphical model inference using the Bayesian graphical lasso. Our method incorporates informative priors on the shrinkage parameters specific to each edge.
We demonstrate through simulations that this method allows improved learning of the network structure when relevant prior information is available, and illustrate the approach on inference of the cellular metabolic network under neuroinflammation. This application highlights the strength of our method since the number of samples available is fairly small, but we are able to draw on rich reference information from publicly available databases describing known metabolic interactions to construct informative priors. Next, we propose a modeling approach for settings where we would like to estimate networks for a collection of possibly related sample groups, where the sample size for each subgroup may be limited. We use a Markov random field prior to link the graphs within each group, and a selection prior to infer which groups have shared network structure. This allows us to encourage common edges across sample groups, when supported by the data. We provide simulation studies to illustrate the properties of our method and compare its performance to competing approaches. We conclude by demonstrating use of the proposed method to infer protein networks for various subtypes of acute myeloid leukemia and to infer signaling networks under different experimental perturbations.

Item Bayesian graphical models for complex biological networks (2015-12-04) Ni, Yang; Vannucci, Marina; Stingo, Francesco C.
In this thesis, we propose novel Bayesian methodologies in estimating graphical models from complex genomic/health data, for which traditional methods are often found to be inefficient and unsuitable. Our approaches are motivated by various applications including construction of non-linear gene regulatory networks, data integration, cancer surveillance and precision medicine. This thesis consists of three projects. First, we develop a novel semi/non-parametric directed acyclic graphical model to reconstruct gene regulatory network from cancer gene expression data.
The regulatory relationship between genes is assumed to be sparse and is allowed to be nonlinear, which is modeled by penalized splines with a spike-and-slab selection prior. We impose a discrete mixture prior on the smoothing parameter of the splines so that we are able to distinguish between linear and nonlinear relationships. Simulation studies show good performance of our approach in comparison with competing methods. Application to GBM data reveals several interesting findings. Second, we propose a multi-dimensional graphical model based on Cholesky-type decomposition of precision matrices to study the conditional independences of multi-dimensional data that are constituted by measurements along multiple axes. Our proposed approach is a unified framework applicable to both directed and undirected graphs as well as arbitrary combinations of these. We develop an efficient sampling algorithm based on partially collapsed Gibbs samplers. Simulation studies show that our method has favorable performance against both benchmark and state-of-the-art approaches. We apply our approach to ovarian cancer protein expression data and U.S. cancer mortality data. Third, we propose a novel class of graphical models, graphical regression, which allow graph structure to vary with additional covariates in a flexible fashion. We impose sparsity in both graph structure and covariates. Our approach produces subject-specific graphs and predictive graphs for new subjects. We provide theoretical properties and demonstrate the good performance of our method through simulation studies.
Finally, we apply our approach to multiple myeloma gene expression data taking prognostic factors as covariates, which reveals several interesting findings.

Item Bayesian graphical models for modern biological applications (Springer Nature, 2022) Ni, Yang; Baladandayuthapani, Veerabhadran; Vannucci, Marina; Stingo, Francesco C.
Graphical models are powerful tools that are regularly used to investigate complex dependence structures in high-throughput biomedical datasets. They allow for holistic, systems-level view of the various biological processes, for intuitive and rigorous understanding and interpretations. In the context of large networks, Bayesian approaches are particularly suitable because they encourage sparsity of the graphs, incorporate prior information, and most importantly account for uncertainty in the graph structure. These features are particularly important in applications with limited sample size, including genomics and imaging studies. In this paper, we review several recently developed techniques for the analysis of large networks under non-standard settings, including but not limited to, multiple graphs for data observed from multiple related subgroups, graphical regression approaches used for the analysis of networks that change with covariates, and other complex sampling and structural settings. We also illustrate the practical utility of some of these methods using examples in cancer genomics and neuroimaging.

Item Bayesian graphical models for modern biological applications (Springer Nature, 2021) Ni, Yang; Baladandayuthapani, Veerabhadran; Vannucci, Marina; Stingo, Francesco C.
Graphical models are powerful tools that are regularly used to investigate complex dependence structures in high-throughput biomedical datasets. They allow for holistic, systems-level view of the various biological processes, for intuitive and rigorous understanding and interpretations.
In the context of large networks, Bayesian approaches are particularly suitable because they encourage sparsity of the graphs, incorporate prior information, and, most importantly, account for uncertainty in the graph structure. These features are particularly important in applications with limited sample size, including genomics and imaging studies. In this paper, we review several recently developed techniques for the analysis of large networks under non-standard settings, including, but not limited to, multiple graphs for data observed from multiple related subgroups, graphical regression approaches used for the analysis of networks that change with covariates, and other complex sampling and structural settings. We also illustrate the practical utility of some of these methods using examples in cancer genomics and neuroimaging.
Item Bayesian Graphical Models for Multiple Networks (2019-04-16) Shaddox, Elin Brooke; Vannucci, Marina
In recent years, novel methods for graphical model inference have been widely applied to infer biological networks for high-throughput data. When studying complex diseases, these network-based inferential approaches can be crucial in evaluating patterns of variable association and determining the cellular-level changes influencing disease susceptibility, progression, and variation across heterogeneous subjects. This thesis is focused on developing flexible joint graph methodology for estimating network structures of multiple sample groups. These approaches employ Gaussian graphical models, Markov random field priors, continuous shrinkage priors, and Dirichlet process priors to infer graph structures for each sample group, while sharing information between related graphs without assuming any similarity.
These methods are illustrated through simulation studies and applications to case studies of Chronic Obstructive Pulmonary Disease (COPD).
Item Bayesian Graphical Models for Multivariate Time Series (2022-12-02) Liu, Chunshan; Kowal, Daniel R.; Vannucci, Marina
Gaussian graphical models are widely popular for studying the conditional dependence among random variables. By encoding conditional dependence as an undirected graph, these models provide interpretable representations and insightful visualizations of the relationships among variables. However, time series data often violate the assumptions of Gaussian graphical models. In time series, the data are often not iid; the graphs can evolve over time, with changes occurring at unknown time points. We first extend Bayesian graphical models to time series data with heavy-tailed characteristics. We introduce a Dynamic and Robust Gaussian Graphical model, which is able to identify dynamics in the graph, share information across time, and estimate graphs from highly contaminated data. We then consider the scenario where the data are less contaminated and close to smooth curves. We introduce a Dynamic Bayesian Functional Graphical Model, where the observed data are viewed as realizations of random functions varying over a continuum of time. Unlike the dynamic and robust time series model, each node in the functional graphical model represents a function. The model inserts a change point in time and estimates two different graphs before and after the change point. The proposed methods demonstrate excellent graph estimation for simulated data with improvements over existing graphical models. We apply these methods in various applications, including gesture tracing data, futures return data and sea surface temperature data, and discover meaningful edges and dynamics.