Browsing by Author "Kowal, Daniel R."
A Bayesian Multivariate Functional Dynamic Linear Model (Taylor & Francis, 2017)
Kowal, Daniel R.; Matteson, David S.; Ruppert, David
We present a Bayesian approach for modeling multivariate, dependent functional data. To account for the three dominant structural features in the data—functional, time dependent, and multivariate components—we extend hierarchical dynamic linear models for multivariate time series to the functional data setting. We also develop Bayesian spline theory in a more general constrained optimization framework. The proposed methods identify a time-invariant functional basis for the functional observations, which is smooth and interpretable, and can be made common across multivariate observations for additional information sharing. The Bayesian framework permits joint estimation of the model parameters, provides exact inference (up to MCMC error) on specific parameters, and allows generalized dependence structures. Sampling from the posterior distribution is accomplished with an efficient Gibbs sampling algorithm. We illustrate the proposed framework with two applications: (1) multi-economy yield curve data from the recent global recession, and (2) local field potential brain signals in rats, for which we develop a multivariate functional time series approach for multivariate time–frequency analysis. Supplementary materials, including R code and the multi-economy yield curve data, are available online.

Bayesian data synthesis and the utility-risk trade-off for mixed epidemiological data (Project Euclid, 2022)
Feldman, Joseph; Kowal, Daniel R.
Much of the microdata used for epidemiological studies contain sensitive measurements on real individuals. As a result, such microdata cannot be published out of privacy concerns, and without public access to these data, any statistical analyses originally published on them are nearly impossible to reproduce.
To promote the dissemination of key datasets for analysis without jeopardizing the privacy of individuals, we introduce a cohesive Bayesian framework for the generation of fully synthetic high-dimensional microdatasets of mixed categorical, binary, count, and continuous variables. This process centers around a joint Bayesian model that is simultaneously compatible with all of these data types, enabling the creation of mixed synthetic datasets through posterior predictive sampling. Furthermore, a focal point of epidemiological data analysis is the study of conditional relationships between various exposures and key outcome variables through regression analysis. We design a modified data synthesis strategy to target and preserve these conditional relationships, including both nonlinearities and interactions. The proposed techniques are deployed to create a synthetic version of a confidential dataset containing dozens of health, cognitive, and social measurements on nearly 20,000 North Carolina children.

Dynamic Regression Models for Time-Ordered Functional Data (Project Euclid, 2021)
Kowal, Daniel R.
For time-ordered functional data, an important yet challenging task is to forecast functional observations with uncertainty quantification. Scalar predictors are often observed concurrently with functional data and provide valuable information about the dynamics of the functional time series. We develop a fully Bayesian framework for dynamic functional regression, which employs scalar predictors to model the time-evolution of functional data. Functional within-curve dependence is modeled using unknown basis functions, which are learned from the data. The unknown basis provides substantial dimension reduction, which is essential for scalable computing, and may incorporate prior knowledge such as smoothness or periodicity.
The dynamics of the time-ordered functional data are specified using a time-varying parameter regression model in which the effects of the scalar predictors evolve over time. To guard against overfitting, we design shrinkage priors that regularize irrelevant predictors and shrink toward time-invariance. Simulation studies decisively confirm the utility of these modeling and prior choices. Posterior inference is available via a customized Gibbs sampler, which offers unrivaled scalability for Bayesian dynamic functional regression. The methodology is applied to model and forecast yield curves using macroeconomic predictors, and demonstrates exceptional forecasting accuracy and uncertainty quantification over the span of four decades.

Dynamic shrinkage processes (Wiley, 2019)
Kowal, Daniel R.; Matteson, David S.; Ruppert, David
We propose a novel class of dynamic shrinkage processes for Bayesian time series and regression analysis. Building on a global–local framework of prior construction, in which continuous scale mixtures of Gaussian distributions are employed for both desirable shrinkage properties and computational tractability, we model dependence between the local scale parameters. The resulting processes inherit the desirable shrinkage behaviour of popular global–local priors, such as the horseshoe prior, but provide additional localized adaptivity, which is important for modelling time series data or regression functions with local features. We construct a computationally efficient Gibbs sampling algorithm based on a Pólya–gamma scale mixture representation of the process proposed. Using dynamic shrinkage processes, we develop a Bayesian trend filtering model that produces more accurate estimates and tighter posterior credible intervals than do competing methods, and we apply the model for irregular curve fitting of minute‐by‐minute Twitter central processor unit usage data.
In addition, we develop an adaptive time varying parameter regression model to assess the efficacy of the Fama–French five‐factor asset pricing model with momentum added as a sixth factor. Our dynamic analysis of manufacturing and healthcare industry data shows that, with the exception of the market risk, no other risk factors are significant except for brief periods.

Fast, Optimal, and Targeted Predictions Using Parameterized Decision Analysis (Taylor & Francis, 2022)
Kowal, Daniel R.
Prediction is critical for decision-making under uncertainty and lends validity to statistical inference. With targeted prediction, the goal is to optimize predictions for specific decision tasks of interest, which we represent via functionals. Although classical decision analysis extracts predictions from a Bayesian model, these predictions are often difficult to interpret and slow to compute. Instead, we design a class of parameterized actions for Bayesian decision analysis that produce optimal, scalable, and simple targeted predictions. For a wide variety of action parameterizations and loss functions—including linear actions with sparsity constraints for targeted variable selection—we derive a convenient representation of the optimal targeted prediction that yields efficient and interpretable solutions. Customized out-of-sample predictive metrics are developed to evaluate and compare among targeted predictors. Through careful use of the posterior predictive distribution, we introduce a procedure that identifies a set of near-optimal, or acceptable targeted predictors, which provide unique insights into the features and level of complexity needed for accurate targeted prediction. Simulations demonstrate excellent prediction, estimation, and variable selection capabilities. Targeted predictions are constructed for physical activity (PA) data from the National Health and Nutrition Examination Survey to better predict and understand the characteristics of intraday PA.
Supplementary materials for this article are available online.

Semiparametric count data regression for self-reported mental health (Wiley, 2023)
Kowal, Daniel R.; Wu, Bohan
“For how many days during the past 30 days was your mental health not good?” The responses to this question measure self-reported mental health and can be linked to important covariates in the National Health and Nutrition Examination Survey (NHANES). However, these count variables present major distributional challenges: The data are overdispersed, zero-inflated, bounded by 30, and heaped in 5- and 7-day increments. To address these challenges—which are especially common for health questionnaire data—we design a semiparametric estimation and inference framework for count data regression. The data-generating process is defined by simultaneously transforming and rounding (STAR) a latent Gaussian regression model. The transformation is estimated nonparametrically and the rounding operator ensures the correct support for the discrete and bounded data. Maximum likelihood estimators are computed using an expectation-maximization (EM) algorithm that is compatible with any continuous data model estimable by least squares. STAR regression includes asymptotic hypothesis testing and confidence intervals, variable selection via information criteria, and customized diagnostics. Simulation studies validate the utility of this framework. Using STAR regression, we identify key factors associated with self-reported mental health and demonstrate substantial improvements in goodness-of-fit compared to existing count data regression models.

Stochastic clustering and pattern matching for real-time geosteering (Society of Exploration Geophysicists, 2019)
Wu, Mingqi; Miao, Yinsen; Panchal, Neilkunal; Kowal, Daniel R.; Vannucci, Marina; Vila, Jeremy; Liang, Faming
We have developed a Bayesian statistical framework for quantitative geosteering in real time.
Two types of contemporary geosteering approaches, model based and stratification based, are introduced. The latter is formulated as a Bayesian optimization procedure: The log from a pilot reference well is used as a stratigraphic signature of the geologic structure in a given region; the observed log sequence acquired along the wellbore is projected into the stratigraphic domain given a proposed earth model and directional survey; the pattern similarity between the converted log and the signature is measured by a correlation coefficient; then stochastic searching is performed on the space of all possible earth models to maximize the similarity under constraints of the prior understanding of the drilling process and target formation; finally, an inference is made based on the samples simulated from the posterior distribution using stochastic approximation Monte Carlo in which we extract the most likely earth model and the associated credible intervals as a quantified confidence indicator. We extensively test our method using synthetic and real geosteering data sets. Our method consistently achieves good performance on synthetic data sets with high correlations between the interpreted and the reference logs and provides similar interpretations as the geosteering geologists on four real wells. We also conduct a reliability performance test of the method on a benchmark set of 200 horizontal wells randomly sampled from the Permian Basin. Our Bayesian framework informs geologists with key drilling decisions in real time and helps them navigate the drilling bit into the target formation with confidence.