Browsing by Author "Ray, Bonnie K."
Model-based clustering for multivariate time series of counts (2010)
Thomas, Sarah Julia; Ensor, Katherine B.; Ray, Bonnie K.
This dissertation develops a modeling framework for univariate and multivariate zero-inflated time series of counts and applies the models in a clustering scheme to identify groups of count series with similar behavior. The basic modeling framework is observation-driven Poisson regression with generalized linear model (GLM) structure. The zero-inflated Poisson (ZIP) model is employed to characterize the possibility of extra observed zeros relative to the Poisson, a common feature of count data. These two methods are combined to characterize time series of counts where the counts and the probability of extra zeros may depend on past observations and on exogenous covariates. A key contribution of this work is a novel modeling paradigm for multivariate zero-inflated counts. The three related models considered are the jointly-inflated, the marginally-inflated, and the doubly-inflated multivariate Poisson. The doubly-inflated model encompasses both marginal inflation, which allows for additional zeros at each time epoch for each individual count series, and joint inflation, which allows for zero-inflation across all multivariate series. These models improve upon previously proposed models, which are either too rigid or too simplistic to be useful across a wide variety of applications. To estimate the model parameters, a new Monte Carlo Expectation Maximization (MCEM) algorithm is developed. The Monte Carlo sampling eliminates the complex recursion formulas needed for calculating the probability function of the multivariate Poisson. The algorithm is easily adapted for different multivariate zero-inflation schemes. The new models, new estimation methods, and applications in clustering are demonstrated on simulated and real datasets.
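The univariate ZIP idea mentioned in the abstract can be illustrated with a minimal sketch. This is the textbook two-component mixture (an extra point mass at zero plus an ordinary Poisson), not the dissertation's observation-driven or multivariate models; the function names and the parameter names `lam` and `pi_zero` are hypothetical choices made for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability P(Y = k) for rate lam > 0, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def zip_pmf(k: int, lam: float, pi_zero: float) -> float:
    """Zero-inflated Poisson probability P(Y = k).

    With probability pi_zero the observation is an "extra" zero;
    otherwise it is drawn from Poisson(lam). Both names are assumptions
    for this sketch, not notation taken from the dissertation.
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    # Only the zero count receives the extra inflation mass.
    return pi_zero + base if k == 0 else base


if __name__ == "__main__":
    lam, pi_zero = 2.0, 0.3
    print("P(Y=0) plain Poisson:", poisson_pmf(0, lam))
    print("P(Y=0) zero-inflated:", zip_pmf(0, lam, pi_zero))
```

In a regression setting, `lam` and `pi_zero` would each be linked to covariates and past observations rather than held fixed, which is the part this sketch leaves out.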
For an application in finance, the number of trades and the number of price changes for bonds are modeled as a bivariate doubly zero-inflated Poisson time series, where observations of zero trades or zero price changes represent the liquidity risk for that bond. In an environmental science application, the new models are used in a model-based clustering scheme to study counts of high pollution events at air quality monitoring stations around Houston, Texas. Clustering reveals regions of the air monitoring network that behave similarly in terms of time dependence and response to covariates representing atmospheric conditions and physical sources of air pollution.

Point source influence on observed extreme pollution levels in a monitoring network (Elsevier, 2014)
Ensor, Katherine B.; Ray, Bonnie K.; Charlton, Sarah J.
This paper presents a strategy to quantify the influence that major point sources in a region have on extreme pollution values observed at each of the monitors in the network. We focus on the number of hours in a day that the levels at a monitor exceed a specified health threshold. The number of daily exceedances is modeled using observation-driven negative binomial time series regression models, allowing for a zero-inflation component to characterize the probability of no exceedances on a particular day. The spatial nature of the problem is addressed through the use of a Gaussian plume model for atmospheric dispersion computed at locations of known emissions, creating covariates that impact exceedances. In order to isolate the influence of emitters at individual monitors, we fit separate regression models to the series of counts from each monitor. We apply a final model clustering step to group monitor series that exhibit similar behavior with respect to mean, variability, and common contributors, to support policy decision making.
The methodology is applied to eight benzene pollution series measured at air quality monitors around the Houston Ship Channel, a major industrial port.
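For readers unfamiliar with the Gaussian plume model named in this abstract, the following is a minimal sketch of the standard steady-state formula with ground reflection. The paper's own parameterization is not given here, so the function name, the argument names, and the choice to pass the dispersion widths `sigma_y` and `sigma_z` directly (rather than derive them from downwind distance and atmospheric stability) are all assumptions made for illustration.

```python
import math


def gaussian_plume(q: float, u: float, sigma_y: float, sigma_z: float,
                   y: float, z: float, h: float) -> float:
    """Steady-state concentration from a continuous point source.

    q        emission rate (mass per second)
    u        wind speed along the plume axis (m/s)
    sigma_y  crosswind dispersion width at the receptor's downwind distance (m)
    sigma_z  vertical dispersion width at the receptor's downwind distance (m)
    y        crosswind offset of the receptor from the plume centerline (m)
    z        receptor height above ground (m)
    h        effective release height (m)
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    crosswind = math.exp(-y ** 2 / (2.0 * sigma_y ** 2))
    # The second exponential is the image source that models ground reflection.
    vertical = (math.exp(-(z - h) ** 2 / (2.0 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2.0 * sigma_z ** 2)))
    return q / (2.0 * math.pi * u * sigma_y * sigma_z) * crosswind * vertical


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the function.
    print(gaussian_plume(q=100.0, u=3.0, sigma_y=50.0, sigma_z=20.0,
                         y=0.0, z=0.0, h=30.0))
```

A value computed this way for each emitter and monitor pair is the kind of quantity that could serve as a regression covariate, although how the paper constructs its covariates is not stated in the abstract.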