Browsing by Author "Chi, Eric C."

Now showing 1 - 4 of 4
    A convex-nonconvex strategy for grouped variable selection
    (Project Euclid, 2023) Liu, Xiaoqian; Molstad, Aaron J.; Chi, Eric C.
    This paper deals with the grouped variable selection problem. A widely used strategy is to augment the negative log-likelihood function with a sparsity-promoting penalty. Existing methods include the group Lasso, group SCAD, and group MCP. The group Lasso solves a convex optimization problem but suffers from underestimation bias. The group SCAD and group MCP avoid this estimation bias but require solving a nonconvex optimization problem that may be plagued by suboptimal local optima. In this work, we propose an alternative method based on the generalized minimax concave (GMC) penalty, which is a folded concave penalty that maintains the convexity of the objective function. We develop a new method for grouped variable selection in linear regression, the group GMC, that generalizes the strategy of the original GMC estimator. We present a primal-dual algorithm for computing the group GMC estimator and also prove properties of the solution path to guide its numerical computation and tuning parameter selection in practice. We establish error bounds for both the group GMC and original GMC estimators. A rich set of simulation studies and a real data application indicate that the proposed group GMC approach outperforms existing methods in several different aspects under a wide array of scenarios.
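    For orientation, the convex baseline the abstract contrasts against is the group Lasso which, for a response y, design matrix X, and prespecified groups g = 1, ..., G, solves a penalized least-squares problem. The notation below is a generic sketch of that baseline, not taken from the paper:
    \[
    \hat{\beta} \;=\; \operatorname*{arg\,min}_{\beta}\; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2 \;+\; \lambda \sum_{g=1}^{G} \sqrt{p_g}\,\lVert \beta_g \rVert_2 ,
    \]
    where \beta_g is the coefficient subvector of group g and p_g its size. Because the group-wise norm is not squared, entire groups are set exactly to zero, but the same shrinkage that induces sparsity also biases the surviving estimates toward zero; the group GMC keeps the grouped structure while swapping this penalty for a folded concave one constructed so that the full objective stays convex.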
    Estimating a common period for a set of irregularly sampled functions with applications to periodic variable star data
    (Project Euclid, 2016) Long, James P.; Chi, Eric C.; Baraniuk, Richard G.
    We consider the problem of estimating a common period for a set of functions sampled at irregular intervals. The motivating problem arises in astronomy, where the functions represent a star’s observed brightness over time through different photometric filters. While current methods perform well when the brightness is sampled densely enough in at least one filter, they break down when no brightness function is densely sampled. In this paper we introduce two new methods for period estimation in this important latter case. The first, multiband generalized Lomb–Scargle (MGLS), extends the frequently used Lomb–Scargle method to naïvely combine information across filters. The second, penalized generalized Lomb–Scargle (PGLS), builds on MGLS by more intelligently borrowing strength across filters. Specifically, we incorporate constraints on the phases and amplitudes across the different functions using a nonconvex penalized likelihood function. We develop a fast algorithm to optimize the penalized likelihood that combines block coordinate descent with the majorization–minimization (MM) principle. We test and validate our methods on synthetic and real astronomy data. Both PGLS and MGLS improve period estimation accuracy over current methods based on using a single function; moreover, PGLS outperforms MGLS and other leading methods when the functions are sparsely sampled.
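    As a rough sketch of the single-band building block (generic notation, not the paper's): at each trial frequency \omega, the generalized Lomb–Scargle method fits a sinusoid with a floating mean to the observations (t_i, y_i) by weighted least squares,
    \[
    m(t) \;=\; \beta_0 + a\cos(\omega t) + b\sin(\omega t), \qquad
    \hat{\omega} \;=\; \operatorname*{arg\,min}_{\omega}\; \min_{\beta_0, a, b}\; \sum_{i} w_i \bigl( y_i - m(t_i) \bigr)^2 ,
    \]
    and the estimated period is 2\pi/\hat{\omega}. MGLS, as described in the abstract, fits a separate mean, amplitude, and phase per photometric band while sharing a single \omega across bands; PGLS additionally penalizes disagreement among the per-band phases and amplitudes.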
    Imaging genetics via sparse canonical correlation analysis
    (IEEE, 2013) Chi, Eric C.; Allen, Genevera I.; Zhou, Hua; Kohannim, Omid; Lange, Kenneth; Thompson, Paul M.
    The collection of brain images from populations of subjects who have been genotyped with genome-wide scans makes it feasible to search for genetic effects on the brain. Even so, multivariate methods are sorely needed that can search both images and the genome for relationships, making use of the correlation structure of both datasets. Here we investigate the use of sparse canonical correlation analysis (CCA) to home in on sets of genetic variants that explain variance in a set of images. We extend recent work on penalized matrix decomposition to account for the correlations in both datasets. Such methods show promise in imaging genetics as they exploit the natural covariance in the datasets. They also avoid an astronomically heavy statistical correction for searching the whole genome and the entire image for promising associations.
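    A generic sparse CCA criterion in the penalized-matrix-decomposition style reads as follows; the specific penalties in the paper, which also account for correlation within each dataset, may differ, so this is only an illustrative sketch. With X the n x p matrix of genetic variants and Y the n x q matrix of image features, one seeks sparse weight vectors
    \[
    \max_{u \in \mathbb{R}^{p},\, v \in \mathbb{R}^{q}}\; u^{\top} X^{\top} Y\, v
    \quad \text{subject to} \quad
    \lVert u \rVert_2^2 \le 1,\;\; \lVert v \rVert_2^2 \le 1,\;\; \lVert u \rVert_1 \le c_1,\;\; \lVert v \rVert_1 \le c_2 ,
    \]
    so that only a small set of variants (the nonzero entries of u) and image features (the nonzero entries of v) drive the estimated cross-correlation.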
    Parametric classification and variable selection by the minimum integrated squared error criterion
    (2012) Chi, Eric C.; Scott, David W.
    This thesis presents a robust solution to the classification and variable selection problem when the dimension of the data, or number of predictor variables, may greatly exceed the number of observations. When faced with the problem of classifying objects given many measured attributes of the objects, the goal is to build a model that makes the most accurate predictions using only the most meaningful subset of the available measurements. The introduction of ℓ1-regularized model fitting has inspired many approaches that simultaneously do model fitting and variable selection. If parametric models are employed, the standard approach is some form of regularized maximum likelihood estimation. While this is an asymptotically efficient procedure under very general conditions, it is not robust. Outliers can negatively impact both estimation and variable selection. Moreover, outliers can be very difficult to identify as the number of predictor variables becomes large. Minimizing the integrated squared error, or L2 error, while less efficient, has been shown to generate parametric estimators that are robust to a fair amount of contamination in several contexts. In this thesis, we present a novel robust parametric regression model for the binary classification problem based on the L2 distance, the logistic L2 estimator (L2E). To perform simultaneous model fitting and variable selection among correlated predictors in the high-dimensional setting, an elastic net penalty is introduced. A fast computational algorithm for minimizing the elastic net penalized logistic L2E loss is derived and results on the algorithm's global convergence properties are given. Through simulations we demonstrate the utility of the penalized logistic L2E at robustly recovering sparse models from high-dimensional data in the presence of outliers and inliers. Results on real genomic data are also presented.
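    For reference, the L2E device the thesis builds on can be sketched in generic notation (not the thesis's own). For a parametric density f_\theta fitted to observations x_1, ..., x_n, expanding the integrated squared error \int (f_\theta - f)^2\,dx and dropping the term that does not involve \theta leads to the estimator
    \[
    \hat{\theta} \;=\; \operatorname*{arg\,min}_{\theta}\; \int f_{\theta}(x)^2\, dx \;-\; \frac{2}{n}\sum_{i=1}^{n} f_{\theta}(x_i),
    \]
    where \int f_\theta f\,dx has been replaced by the empirical average of f_\theta over the data. The quadratic term limits the influence any single observation can exert, which is the source of the robustness the abstract describes; the penalized logistic L2E then adds an elastic net term, e.g. \lambda\bigl(\alpha \lVert \beta \rVert_1 + \tfrac{1-\alpha}{2} \lVert \beta \rVert_2^2\bigr), to select variables among correlated predictors.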