Browsing by Author "Segarra, Santiago"
Now showing 1 - 17 of 17
Item Annealed Langevin Dynamics for MIMO Communications (2024-01-29) Zilberstein, Nicolas M; Segarra, Santiago; Sabharwal, Ashutosh
Solving the optimal data detection problem in multiple-input multiple-output (MIMO) systems is known to be NP-hard. Moreover, the difficulty is exacerbated when the channel state information (CSI) is unavailable. In this work we propose a MIMO detector for the two scenarios, namely when the CSI is known and when it is unknown. First, for the case of perfect CSI, we propose a MIMO detector based on an annealed version of Langevin dynamics. More precisely, we define a stochastic dynamical process whose stationary distribution coincides with the posterior distribution of the data given our observations. This allows us to approximate the maximum a posteriori estimator of the transmitted symbols by sampling from the proposed Langevin dynamics. We carefully craft these stochastic dynamics by gradually adding a sequence of noise with decreasing variance to the trajectories, which ensures that the estimated symbols belong to a pre-specified discrete constellation. Second, for the case of unknown CSI, we propose a joint data detection and channel estimation solution, where we define an annealed Langevin diffusion whose stationary distribution is the joint posterior of the channels and data given noisy observations.

Item Automated detection of activity onset after postictal generalized EEG suppression (BioMed Central, 2020) Lamichhane, Bishal; Kim, Yejin; Segarra, Santiago; Zhang, Guoqiang; Lhatoo, Samden; Hampson, Jaison; Jiang, Xiaoqian
Background: Sudden unexpected death in epilepsy (SUDEP) is a leading cause of premature death in patients with epilepsy. If timely assessment of SUDEP risk can be made, early interventions for optimized treatments might be provided. One of the biomarkers being investigated for SUDEP risk assessment is postictal generalized EEG suppression (PGES).
For example, prolonged PGES has been found to be associated with a higher risk for SUDEP. Accurate characterization of PGES requires correct identification of the end of PGES, which is often complicated due to signal noise and artifacts, and has been reported to be a difficult task even for trained clinical professionals. In this work we present a method for automatic detection of the end of PGES using multi-channel EEG recordings, thus enabling the downstream task of SUDEP risk assessment by PGES characterization. Methods: We address the detection of the end of PGES as a classification problem. Given a short EEG snippet, a trained model classifies whether or not it contains the end of PGES. Scalp EEG recordings from a total of 134 patients with epilepsy are used for training a random forest based classification model. Various time-series based features are used to characterize the EEG signal for the classification task. The features that we have used are computationally inexpensive, making them suitable for real-time implementations and low-power solutions. The reference labels for classification are based on annotations by trained clinicians identifying the end of PGES in an EEG recording. Results: We evaluated our classification model on an independent test dataset from 34 epileptic patients and obtained an area under the receiver operating characteristic curve (AUROC) of 0.84. We found that inclusion of multiple EEG channels is important for better classification results, possibly owing to the generalized nature of PGES. Among the channels included in our analysis, the central EEG channels were found to provide the best discriminative representation for the detection of the end of PGES. Conclusion: Accurate detection of the end of PGES is important for PGES characterization and SUDEP risk assessment.
In this work, we showed that it is feasible to automatically detect the end of PGES, which is otherwise difficult to detect due to EEG noise and artifacts, using time-series features derived from multi-channel EEG recordings. In future work, we will explore deep learning based models for improved detection and investigate the downstream task of PGES characterization for SUDEP risk assessment.

Item Bias-variance Trade-off and Uncertainty Quantification: Effects of Data Distribution in Image Classification (2022-11-18) Yilmaz, Fatih Furkan; Heckel, Reinhard; Segarra, Santiago
Understanding the training and generalization dynamics of deep neural networks as well as the actual accuracy of the network predictions when deployed in the wild are important open problems in machine learning. In this thesis, we study these two topics in the context of image classification. In the first part, we study the generalization properties of deep neural networks with respect to the regularization of the network training for standard image classification tasks. In the second part, we study the performance of conformal prediction based uncertainty estimation methods. Conformal prediction methods quantify the uncertainty of the predictions of a neural network in practical applications. We study the setup where the test distribution may induce a drop in the accuracy of the predictions due to distribution shift. The training of deep neural networks is often regularized either implicitly, for example by early stopping the gradient descent, or explicitly, by adding an $\ell_2$-penalty to the loss function, in order to prevent overfitting to spurious patterns or noise. Even though these regularization methods are well established in the literature, recently it was uncovered that the test error of the network can exhibit novel phenomena such as yielding a double descent shape with respect to the regularization amount.
In the first part of this thesis, we develop a theoretical understanding of the double descent phenomenon with respect to model regularization. For this, we study regression tasks, in both the underparameterized and overparameterized regimes, for linear and non-linear models. We find that for linear regression, a double descent shaped risk is caused by a superposition of bias-variance tradeoffs corresponding to different parts of the data/model and can be mitigated by the proper scaling of the stepsizes or regularization strengths while improving the best-case performance. We next study a non-linear two-layer neural network and characterize the early-stopped gradient descent risk as a superposition of bias-variance tradeoffs and also show that double descent as a function of the L2-regularization coefficient occurs outside of the regime where the risk can be characterized using the existing tools in the literature. We empirically study deep networks trained on standard image classification datasets and show that our results well explain the dynamics of the network training. In the second part of this thesis, we consider the effects of data distribution shift at test time for standard deep neural network classifiers. While recent uncertainty quantification methods like conformal prediction can generate provably valid confidence measures for any pre-trained black-box image classifier, these guarantees fail when there is a distribution shift. We propose a simple test-time recalibration method based on only unlabeled examples that provides excellent uncertainty estimates under natural distribution shifts. We show that our method provably succeeds on a theoretical toy distribution shift problem. 
Empirically, we show the success of our method for various natural distribution shifts of the popular ImageNet dataset.

Item Blind Network Inference (2021-08-12) Roddenberry, Thomas M; Segarra, Santiago
We consider the extraction of “coarse” descriptions of networks strictly from observing data supported on their nodes. Taking a graph signal processing perspective, we model the observed data as the output of a graph filter applied to white noise. We then consider two tasks. First, we infer the eigenvector centrality ranking of the underlying graph, drawing connections between network diffusion processes, graph filters, and the power method for eigenvector computation. In doing so, we derive statistical guarantees for correctly comparing pairs of nodes in terms of their eigenvector centrality ranking from a simple PCA-type procedure. Second, we extract the community structure of a collection of planted partition graphs over a shared set of nodes from nodal data. In this case, we show that the optimal conditions for centrality ranking are suboptimal for community detection. We also derive statistical guarantees for estimating the number of communities in the underlying planted partition model as well as exactly inferring the community membership structure.

Item Embargo Covariate Balancing Methods for Randomized Controlled Trials Are Not Adversarially Robust (2023-06-20) Babaei, Hossein; Baraniuk, Richard G; Segarra, Santiago; Sabharwal, Ashutosh
The first step towards investigating the effectiveness of a treatment via a randomized trial is to split the population into control and treatment groups and then compare the average response of the treatment group receiving the treatment to the control group receiving the placebo. In order to ensure that the difference between the two groups is caused only by the treatment, it is crucial that the control and the treatment groups have similar statistics.
Indeed, the validity and reliability of a trial are determined by the similarity of two groups’ statistics. Covariate balancing methods increase the similarity between the distributions of the two groups’ covariates. However, often in practice, there are not enough samples to accurately estimate the groups’ covariate distributions. In this thesis, we empirically show that covariate balancing with the Standardized Means Difference (SMD) covariate balancing measure, as well as Pocock’s sequential treatment assignment method, are susceptible to worst-case treatment assignments. Worst-case treatment assignments are those that are admitted by the covariate balance measure but result in the highest possible average treatment effect (ATE) estimation errors. We develop an adversarial attack to find adversarial treatment assignments for any given trial. Then, we provide an index to measure how close the given trial is to the worst case. To this end, we provide an optimization-based algorithm, namely Adversarial Treatment ASsignment in TREatment Effect Trials (ATASTREET), to find the adversarial treatment assignments.

Item Estimation of Gaussian Graphical Models Using Learned Graph Priors (2024-08-06) Sevilla, Martin; Segarra, Santiago
We propose a novel algorithm for estimating Gaussian graphical models incorporating prior information about the underlying graph. Classical approaches generally propose optimization problems with sparsity penalties as prior information. While efficient, these approaches do not allow using involved prior distributions and force us to incorporate the prior information on the precision matrix rather than on its support. In this work, we investigate how to estimate the graph of a Gaussian graphical model by introducing any prior distribution directly on the graph structure. We use graph neural networks to learn the score function of any graph prior and then leverage Langevin diffusion to generate samples from the posterior distribution.
We study the estimation of both partially known and entirely unknown graphical models and prove that our proposed estimator is consistent in both scenarios. Finally, numerical experiments using synthetic and real-world graphs demonstrate the benefits of our approach.

Item Embargo Graph-based Learning for Efficient Resource Allocation in Wireless Networks under Constraints (2024-08-08) Chowdhury, Arindam; Segarra, Santiago
Optimal allocation of resources, such as power and bandwidth, is essential for increasing spectral efficiency and improving effective network capacity to meet the high quality-of-service (QoS) requirements of modern wireless systems. This is especially challenging under randomly varying channel characteristics and user demands. In particular, power allocation in a wireless network is crucial to mitigate multi-user interference, one of the main performance-limiting factors. The task of interference management is framed as a utility maximization problem under instantaneous and/or time-varying power constraints. Such formulations are NP-hard, and the existing solutions are expensive yet sub-optimal at best. Recently, deep learning algorithms have been extensively employed to obtain approximate solvers efficiently. In particular, graph-based models have been shown to be most effective in leveraging the irregular connectivity structure of wireless networks. In this thesis, we focus on developing near-optimal, generalizable, lightweight, and robust Graph Neural Network (GNN)-based algorithms for effectively solving NP-hard optimization problems in wireless systems under instantaneous and time-varying constraints. The first part of this work specializes in designing domain-informed graph-ML algorithms by leveraging the paradigm of algorithm unfolding for fast and efficient instantaneous power allocation in SISO wireless ad hoc networks (WANET) with theoretically guaranteed convergence and robustness.
The next part involves extending the unfolded solution and the theoretical analyses to address the optimal beamforming problem in MISO and MIMO interference networks under max-power constraints. The final part leverages constrained reinforcement learning algorithms for episodic sum-rate and harmonic-fairness maximization under time-varying battery constraints and channel conditions in mobile WANETs (MANET). Through these bodies of work, this thesis develops a unified framework for power allocation under time-coupled physical and utility constraints in wireless networks. Through simulation experiments, we demonstrate a consistent performance improvement over state-of-the-art (SOTA) models, both in terms of system utility and inference time. We also establish the generalization performance of the proposed models across multiple network topologies, sizes, fading conditions, and battery states. Further, we show that the proposed architectures are computationally efficient and can be executed with minimal hardware requirements. The hybrid structure of the models enhances interpretability and acts as a fail-safe in case the learnable components are no longer effective. Finally, the proposed framework is flexible and can be seamlessly applied to multiple tasks, including resource allocation, security, and control in wireless networks and systems.

Item Hypergraph cuts with edge-dependent vertex weights (Springer Nature, 2022) Zhu, Yu; Segarra, Santiago
We develop a framework for incorporating edge-dependent vertex weights (EDVWs) into the hypergraph minimum s-t cut problem. These weights are able to reflect different importance of vertices within a hyperedge, thus leading to better characterized cut properties. More precisely, we introduce a new class of hyperedge splitting functions that we call EDVWs-based, where the penalty of splitting a hyperedge depends only on the sum of EDVWs associated with the vertices on each side of the split.
Moreover, we provide a way to construct submodular EDVWs-based splitting functions and prove that a hypergraph equipped with such splitting functions can be reduced to a graph sharing the same cut properties. In this case, the hypergraph minimum s-t cut problem can be solved using well-developed solutions to the graph minimum s-t cut problem. In addition, we show that an existing sparsification technique can be easily extended to our case and makes the reduced graph smaller and sparser, thus further accelerating the algorithms applied to the reduced graph. Numerical experiments using real-world data demonstrate the effectiveness of our proposed EDVWs-based splitting functions in comparison with the all-or-nothing splitting function and cardinality-based splitting functions commonly adopted in existing work.

Item Hypergraphs with edge-dependent vertex weights: p-Laplacians and spectral clustering (Frontiers, 2023) Zhu, Yu; Segarra, Santiago
We study p-Laplacians and spectral clustering for a recently proposed hypergraph model that incorporates edge-dependent vertex weights (EDVW). These weights can reflect different importance of vertices within a hyperedge, thus conferring the hypergraph model higher expressivity and flexibility. By constructing submodular EDVW-based splitting functions, we convert hypergraphs with EDVW into submodular hypergraphs for which the spectral theory is better developed. In this way, existing concepts and theorems such as p-Laplacians and Cheeger inequalities proposed under the submodular hypergraph setting can be directly extended to hypergraphs with EDVW. For submodular hypergraphs with EDVW-based splitting functions, we propose an efficient algorithm to compute the eigenvector associated with the second smallest eigenvalue of the hypergraph 1-Laplacian. We then utilize this eigenvector to cluster the vertices, achieving higher clustering accuracy than traditional spectral clustering based on the 2-Laplacian.
More broadly, the proposed algorithm works for all submodular hypergraphs that are graph reducible. Numerical experiments using real-world data demonstrate the effectiveness of combining spectral clustering based on the 1-Laplacian and EDVW.

Item Identifying the Topology of Undirected Networks From Diffused Non-Stationary Graph Signals (IEEE, 2021) Shafipour, Rasoul; Segarra, Santiago; Marques, Antonio G.; Mateos, Gonzalo
We address the problem of inferring an undirected graph from nodal observations, which are modeled as non-stationary graph signals generated by local diffusion dynamics that depend on the structure of the unknown network. Using the so-called graph-shift operator (GSO), which is a matrix representation of the graph, we first identify the eigenvectors of the shift matrix from observations of the diffused signals, and then estimate the eigenvalues by imposing desirable properties on the graph to be recovered. Unlike in the stationary setting, where the eigenvectors can be obtained directly from the covariance matrix of the measurements, here we first need to estimate the unknown diffusion (graph) filter, a polynomial in the GSO that preserves the sought eigenbasis. To carry out this initial system identification step, we exploit different sources of information on the arbitrarily-correlated input signal driving the diffusion on the graph. We first explore the setting where the observations, the input information, and the unknown graph filter are linearly related. We then address the case where the relation is given by a system of matrix quadratic equations, which arises in pragmatic scenarios where only the second-order statistics of the inputs are available. While such a quadratic filter identification problem boils down to a non-convex fourth-order polynomial minimization, we discuss identifiability conditions, propose algorithms to approximate the solution, and analyze their performance.
Numerical tests illustrate the effectiveness of the proposed topology inference algorithms in recovering brain, social, financial, and urban transportation networks using synthetic and real-world signals.

Item Inference of multiple sparse networks in the presence of hidden nodes (2023-04-19) Navarro, Madeline; Segarra, Santiago
We investigate the increasingly prominent task of jointly inferring multiple networks from nodal observations. Joint network inference has been investigated extensively to expose the benefits of inferring multiple networks while accounting for their structural similarities. However, the primary assumption is that observations are available at all nodes, which is often violated in practice. In this thesis, we consider the realistic and more challenging scenario where a subset of nodes are hidden and cannot be measured. To address this ill-posed problem, we assume that there exist sets of graph signals that are stationary on the networks, which provides a global relationship between the observations and the network topologies such that we may characterize the effect of the hidden nodes. Under the assumptions that signals are stationary and the networks have similar connectivity patterns, we derive structural characteristics of the connectivity between hidden and observed nodes. This allows us to formulate an optimization problem for estimating multiple sparse networks while accounting for the influence of hidden nodes. We prove that convex relaxations maintain the sparsest solution under mild conditions, and we formalize the performance of our proposed optimization problem with respect to the effect of the hidden nodes.
Finally, synthetic and real-world simulations validate the theoretical results and provide evaluations of our method in comparison with other state-of-the-art baselines.

Item Joint embedding of biological networks for cross-species functional alignment (Oxford University Press, 2023) Li, Lechuan; Dannenfelser, Ruth; Zhu, Yu; Hejduk, Nathaniel; Segarra, Santiago; Yao, Vicky
Model organisms are widely used to better understand the molecular causes of human disease. While sequence similarity greatly aids this cross-species transfer, sequence similarity does not imply functional similarity, and thus, several current approaches incorporate protein–protein interactions to help map findings between species. Existing transfer methods either formulate the alignment problem as a matching problem which pits network features against known orthology, or more recently, as a joint embedding problem. We propose a novel state-of-the-art joint embedding solution: Embeddings to Network Alignment (ETNA). ETNA generates individual network embeddings based on network topological structure and then uses a Natural Language Processing-inspired cross-training approach to align the two embeddings using sequence-based orthologs. The final embedding preserves both within and between species gene functional relationships, and we demonstrate that it captures both pairwise and group functional relevance. In addition, ETNA’s embeddings can be used to transfer genetic interactions across species and identify phenotypic alignments, laying the groundwork for potential opportunities for drug repurposing and translational studies. https://github.com/ylaboratory/ETNA

Item Learning on Inhomogeneous Hypergraphs (2023-04-17) Zhu, Yu; Segarra, Santiago
Although graphs are widely used in a myriad of machine learning tasks, they are limited to representing pairwise interactions. By contrast, in many real-world applications the entities engage in higher-order relations.
Such relations can be modeled by hypergraphs, where the notion of an edge is generalized to a hyperedge that can connect more than two vertices. Traditional hypergraph models treat all the vertices in a hyperedge equally while in practice these vertices might contribute differently to the hyperedge. To deal with such cases, edge-dependent vertex weights (EDVWs) are introduced into hypergraphs which are able to reflect different importance of vertices within the same hyperedge. In this thesis, I study several fundamental problems considering the hypergraph model with EDVWs. First, I develop valid Laplacian matrices for this hypergraph model through random walks defined on vertices and hyperedges and incorporating EDVWs, based on which I propose spectral partitioning algorithms for co-clustering vertices and hyperedges. Second, I develop a framework for incorporating EDVWs into hypergraph cut problems via introducing a new class of hyperedge splitting functions which are both submodular and dependent on EDVWs. I also generalize existing reduction as well as sparsification techniques to our setting. Finally, I define p-Laplacians for this hypergraph model and focus on the p=1 case. I propose an efficient algorithm to compute the eigenvector associated with the second smallest eigenvalue of the 1-Laplacian and then use this eigenvector to cluster vertices in order to achieve better performance than traditional spectral clustering based on the 2-Laplacian.

Item Reference-free structural variant detection in microbiomes via long-read co-assembly graphs (Oxford University Press, 2024) Curry, Kristen D; Yu, Feiqiao Brian; Vance, Summer E; Segarra, Santiago; Bhaya, Devaki; Chikhi, Rayan; Rocha, Eduardo P C; Treangen, Todd J
Motivation: The study of bacterial genome dynamics is vital for understanding the mechanisms underlying microbial adaptation, growth, and their impact on host phenotype.
Structural variants (SVs), genomic alterations of 50 base pairs or more, play a pivotal role in driving evolutionary processes and maintaining genomic heterogeneity within bacterial populations. While SV detection in isolate genomes is relatively straightforward, metagenomes present broader challenges due to the absence of clear reference genomes and the presence of mixed strains. In response, our proposed method, rhea, forgoes reference genomes and metagenome-assembled genomes (MAGs) by encompassing all metagenomic samples in a series (time or other metric) into a single co-assembly graph. The log fold change in graph coverage between successive samples is then calculated to call SVs that are thriving or declining. Results: We show rhea to outperform existing methods for SV and horizontal gene transfer (HGT) detection in two simulated mock metagenomes, particularly as the simulated reads diverge from reference genomes and an increase in strain diversity is incorporated. We additionally demonstrate use cases for rhea on series metagenomic data of environmental and fermented food microbiomes to detect specific sequence alterations between successive time and temperature samples, suggesting host advantage. Our approach leverages previous work in assembly graph structural and coverage patterns to provide versatility in studying SVs across diverse and poorly characterized microbial communities for more comprehensive insights into microbial gene flux. Availability and implementation: rhea is open source and available at: https://github.com/treangenlab/rhea.

Item Sampling and Limit Theories for Graph Signal Processing and Large Simplicial Complexes (2023-03-31) Roddenberry, T. Mitchell; Segarra, Santiago
This thesis considers the role of locality and sampling in graph signal processing and network science.
In light of the prevalence of extremely large and complex network datasets, it is a timely problem to consider how the study of these objects can be reduced to the study of a distribution of simpler objects. Indeed, many methods in graph signal processing and machine learning on graphs can be described strictly in light of local graph substructures. The approach taken to this problem starts with defining the notion of taking a sample from a network, and then builds this out to a probabilistic framework for network theory. This framework is applied to understand questions of graph parameter estimation, fundamentals of graph signal processing and Fourier analysis on graphs, transferability of machine learning and signal processing methods for network data, and finally to construct meaningful limiting objects of graphs and simplicial complexes. The study undertaken in this thesis both enhances the understanding of current methods and inspires new methods in light of the notions of transferability defined by sampling.

Item Signal processing on higher-order networks: Livin’ on the edge... and beyond (Elsevier, 2021) Schaub, Michael T.; Zhu, Yu; Seby, Jean-Baptiste; Roddenberry, T. Mitchell; Segarra, Santiago
In this tutorial, we provide a didactic treatment of the emerging topic of signal processing on higher-order networks. Drawing analogies from discrete and graph signal processing, we introduce the building blocks for processing data on simplicial complexes and hypergraphs, two common higher-order network abstractions that can incorporate polyadic relationships. We provide brief introductions to simplicial complexes and hypergraphs, with a special emphasis on the concepts needed for the processing of signals supported on these structures. Specifically, we discuss Fourier analysis, signal denoising, signal interpolation, node embeddings, and nonlinear processing through neural networks, using these two higher-order network models.
In the context of simplicial complexes, we specifically focus on signal processing using the Hodge Laplacian matrix, a multi-relational operator that leverages the special structure of simplicial complexes and generalizes desirable properties of the Laplacian matrix in graph signal processing. For hypergraphs, we present both matrix and tensor representations, and discuss the trade-offs in adopting one or the other. We also highlight limitations and potential research avenues, both to inform practitioners and to motivate the contribution of new researchers to the area.

Item The Dual Graph Shift Operator: Identifying the Support of the Frequency Domain (Springer Nature, 2021) Leus, Geert; Segarra, Santiago; Ribeiro, Alejandro; Marques, Antonio G.
Contemporary data is often supported by an irregular structure, which can be conveniently captured by a graph. Accounting for this graph support is crucial to analyze the data, leading to an area known as graph signal processing (GSP). The two most important tools in GSP are the graph shift operator (GSO), which is a sparse matrix accounting for the topology of the graph, and the graph Fourier transform (GFT), which maps graph signals into a frequency domain spanned by a number of graph-related Fourier-like basis vectors. This alternative representation of a graph signal is called the graph frequency signal. Several attempts have been undertaken in order to interpret the support of this graph frequency signal, but they all resulted in a one-dimensional interpretation. However, if the support of the original signal is captured by a graph, why would the graph frequency signal have a simple one-dimensional support? Departing from existing work, we propose an irregular support for the graph frequency signal, which we call the dual graph.
A dual GSO leads to a better interpretation of the graph frequency signal and its domain, helps to understand how the different graph frequencies are related and clustered, enables the development of better graph filters and filter banks, and facilitates the generalization of classical SP results to the graph domain.
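
Illustrative note (not part of the repository record): the two GSP tools named in the last abstract, the GSO and the GFT, can be sketched in a few lines. The snippet below is a minimal sketch under stated assumptions: the GSO is taken to be the adjacency matrix of a small undirected 4-cycle, the GFT basis is taken to be the eigenvectors of that symmetric matrix, and the function name `graph_fourier_transform` is hypothetical. It does not implement the dual graph construction proposed in the paper.

```python
import numpy as np


def graph_fourier_transform(S, x):
    """Hypothetical helper: GFT of signal x for a symmetric shift operator S.

    Assumes S is real and symmetric, so S = V diag(eigvals) V^T with
    orthonormal V, and the GFT is x_hat = V^T x.
    """
    S = np.asarray(S, dtype=float)
    if not np.allclose(S, S.T):
        raise ValueError("this sketch assumes a symmetric GSO")
    eigvals, V = np.linalg.eigh(S)            # graph frequencies and Fourier basis
    x_hat = V.T @ np.asarray(x, dtype=float)  # graph frequency signal
    return eigvals, V, x_hat


# Assumed example graph: undirected cycle on 4 nodes, adjacency matrix as GSO.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
x = np.array([1.0, 2.0, 3.0, 4.0])            # one value per node

eigvals, V, x_hat = graph_fourier_transform(A, x)
x_rec = V @ x_hat                             # inverse GFT recovers the signal
```

Here `eigvals` plays the role of the graph frequencies and `x_hat` is the graph frequency signal whose support the abstract discusses; other GSO choices, such as the graph Laplacian, are equally common.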