Browsing by Author "Xu, Yangyang"
Now showing 1 - 7 of 7
Item
A Block Coordinate Descent Method for Multi-Convex Optimization with Applications to Nonnegative Tensor Factorization and Completion (2012-08)
Xu, Yangyang; Yin, Wotao
This paper considers block multi-convex optimization, where the feasible set and objective function are generally non-convex but convex in each block of variables. We review some of its interesting examples and propose a generalized block coordinate descent method. Under certain conditions, we show that any limit point satisfies the Nash equilibrium conditions. Furthermore, we establish its global convergence and estimate its asymptotic convergence rate by assuming a property based on the Kurdyka-Łojasiewicz inequality. The proposed algorithms are adapted for factorizing nonnegative matrices and tensors, as well as for completing them from their incomplete observations. The algorithms were tested on synthetic data, hyperspectral data, and image sets from the CBCL and ORL databases. Compared to the existing state-of-the-art algorithms, the proposed algorithms demonstrate superior performance in both speed and solution quality. The Matlab code is available for download from the authors' homepages.

Item
An Alternating Direction Algorithm for Matrix Completion with Nonnegative Factors (2011-01)
Xu, Yangyang; Yin, Wotao; Wen, Zaiwen; Zhang, Yin
This paper introduces a novel algorithm for the nonnegative matrix factorization and completion problem, which aims to find nonnegative matrices X and Y from a subset of entries of a nonnegative matrix M so that XY approximates M. This problem is closely related to two existing problems, nonnegative matrix factorization and low-rank matrix completion, in the sense that it kills two birds with one stone. As it takes advantage of both nonnegativity and low rank, its results can be superior to those of the two problems alone. Our algorithm is applied to minimizing a non-convex constrained least-squares formulation and is based on the classic alternating direction augmented Lagrangian method. Preliminary convergence properties and numerical simulation results are presented. Compared to a recent algorithm for nonnegative random matrix factorization, the proposed algorithm yields comparable factorizations while accessing only half of the matrix entries. On tasks of recovering incomplete grayscale and hyperspectral images, the results of the proposed algorithm have overall better quality than those of two recent algorithms for matrix completion.

Item
Block Coordinate Descent for Regularized Multi-convex Optimization (2013-09-16)
Xu, Yangyang; Yin, Wotao; Tapia, Richard A.; Zhang, Yin; Baraniuk, Richard G.
This thesis considers regularized block multi-convex optimization, where the feasible set and objective function are generally non-convex but convex in each block of variables. I review some of its interesting examples and propose a generalized block coordinate descent (BCD) method. The generalized BCD uses three different block-update schemes. Based on the properties of each block subproblem, one can freely choose one of the three schemes to update the corresponding block of variables. Appropriate choices of block-update schemes can often speed up the algorithm and greatly save computing time. Under certain conditions, I show that any limit point satisfies the Nash equilibrium conditions. Furthermore, I establish its global convergence and estimate its asymptotic convergence rate by assuming a property based on the Kurdyka-Łojasiewicz inequality. As a consequence, this thesis gives a global linear convergence result for cyclic block coordinate descent applied to strongly convex optimization. The proposed algorithms are adapted for factorizing nonnegative matrices and tensors, as well as for completing them from their incomplete observations. The algorithms were tested on synthetic data, hyperspectral data, and image sets from the CBCL, ORL and Swimmer databases. Compared to the existing state-of-the-art algorithms, the proposed algorithms demonstrate superior performance in both speed and solution quality.

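The block-update idea behind the two BCD items above can be illustrated on a small regularized multi-convex example. The sketch below applies prox-linear (proximal gradient) block updates to sparse nonnegative matrix factorization; the objective, step sizes, and variable names are illustrative choices, not the authors' released code or exact scheme.

```python
import numpy as np

# Illustrative problem: min_{X>=0, Y>=0} 0.5*||M - X @ Y||_F^2 + lam*||Y||_1
# Non-convex overall, but convex in X (with Y fixed) and in Y (with X fixed).
rng = np.random.default_rng(0)
M = rng.random((60, 40))            # data matrix to factorize
r, lam = 5, 0.1                     # rank and sparsity weight (illustrative)
X, Y = rng.random((60, r)), rng.random((r, 40))

for it in range(200):
    # Block 1: prox-linear update of X (gradient step + projection onto X >= 0)
    Lx = max(np.linalg.norm(Y @ Y.T, 2), 1e-8)   # Lipschitz constant of grad_X
    gX = (X @ Y - M) @ Y.T
    X = np.maximum(0.0, X - gX / Lx)

    # Block 2: prox-linear update of Y (gradient step + soft-threshold + projection)
    Ly = max(np.linalg.norm(X.T @ X, 2), 1e-8)   # Lipschitz constant of grad_Y
    gY = X.T @ (X @ Y - M)
    Y = np.maximum(0.0, Y - gY / Ly - lam / Ly)

obj = 0.5 * np.linalg.norm(M - X @ Y) ** 2 + lam * np.abs(Y).sum()
print(f"objective after 200 cyclic block updates: {obj:.3f}")
```
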
Item
Block Coordinate Update Method in Tensor Optimization (2014-08-19)
Xu, Yangyang; Yin, Wotao; Zhang, Yin; Allen, Genevera; Tapia, Richard
Block alternating minimization (BAM) has been widely used since the 1950s. It partitions the variables into disjoint blocks and cyclically updates the blocks by minimizing the objective with respect to each block of variables, one at a time with all others fixed. A special case is the alternating projection method for finding a common point of two convex sets. The BAM method is often simple yet efficient, particularly if each block subproblem is easy to solve. However, for certain problems such as nonnegative tensor decomposition, the block subproblems can be difficult to solve; moreover, even when they are solved exactly or to high accuracy, BAM can perform poorly on the original problem, in particular on non-convex problems. On the other hand, in the literature the BAM method is mainly analyzed for convex problems. Although it has been shown numerically to work well on many non-convex problems, theoretical results for BAM in non-convex optimization are still lacking. For these reasons, I propose different block update schemes and generalize the BAM method to non-smooth non-convex optimization problems. Which scheme is most efficient depends on the specific application. In addition, I analyze convergence of the generalized method, dubbed the block coordinate update (BCU) method, with different block update schemes for non-smooth optimization problems, in both convex and non-convex cases. BCU has found many applications, and the work in this dissertation is mainly motivated by tensor optimization problems, for which the BCU method is often the best choice due to their block convexity. I make contributions in modeling, algorithm design, and theoretical analysis. The first part is about low-rank tensor completion, for which I propose a novel model based on parallel low-rank matrix factorization. The new model is non-convex, and it is difficult to guarantee globally optimal solutions. However, the BAM method performs very well on this model. Global convergence in terms of KKT conditions is established, and numerical experiments demonstrate the superiority of the proposed model over several state-of-the-art ones. The second part addresses nonnegative tensor decomposition. For this problem, each block subproblem is a nonnegative least squares problem and is not simple to solve; hence, the BAM method may be inefficient. I propose a block proximal gradient (BPG) method. In contrast to BAM, which solves each block subproblem exactly, BPG solves relaxed block subproblems, which are often much simpler than the original ones and can thus make BPG converge faster. Through the Kurdyka-Łojasiewicz property, I establish its global convergence with a rate estimate in terms of the iterate sequence. Numerical experiments on sparse nonnegative Tucker decomposition demonstrate its superiority over the BAM method. The last part is motivated by tensor regression problems, whose block partial gradients are expensive to evaluate. For such problems, BPG becomes inefficient, and I propose to use inexact partial gradients and generalize BPG to a block stochastic gradient method. Convergence results in expectation are established for the general non-convex case in terms of first-order optimality conditions, and, for the convex case, a sublinear convergence rate is shown. Numerical tests on tensor regression problems show that the block stochastic gradient method significantly outperforms its deterministic counterpart.

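As a small illustration of the BAM-versus-BPG distinction described in the thesis abstract above, the sketch below shows the two kinds of block updates for a nonnegative least squares subproblem min_{X>=0} 0.5*||A X - B||_F^2, of the kind that arises in nonnegative matrix/tensor decomposition. The helper names and the use of SciPy's NNLS solver are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np
from scipy.optimize import nnls

def bam_block_update(A, B):
    """Exact BAM-style update: solve min_{X>=0} 0.5*||A @ X - B||_F^2 column by column."""
    return np.column_stack([nnls(A, B[:, j])[0] for j in range(B.shape[1])])

def bpg_block_update(A, B, X_prev):
    """Relaxed BPG-style update: a single projected gradient (prox-linear) step from X_prev."""
    L = np.linalg.norm(A.T @ A, 2)     # Lipschitz constant of the block gradient
    grad = A.T @ (A @ X_prev - B)
    return np.maximum(0.0, X_prev - grad / L)

rng = np.random.default_rng(1)
A, B = rng.random((30, 5)), rng.random((30, 8))
X0 = rng.random((5, 8))
print("exact block subproblem residual:", np.linalg.norm(A @ bam_block_update(A, B) - B))
print("one BPG step residual:          ", np.linalg.norm(A @ bpg_block_update(A, B, X0) - B))
```
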
Item
Block Stochastic Gradient Iteration for Convex and Nonconvex Optimization (2014-08)
Xu, Yangyang; Yin, Wotao
The stochastic gradient (SG) method can minimize an objective function composed of a large number of differentiable functions, or solve a stochastic optimization problem, very quickly to a moderate accuracy. The block coordinate descent/update (BCD) method, on the other hand, handles problems with multiple blocks of variables by updating them one at a time; when the blocks of variables are (much) easier to update individually than together, BCD has a (much) lower per-iteration cost. This paper introduces a method that combines the strengths of SG and BCD for problems with many components in the objective and with multiple (blocks of) variables. Specifically, a block stochastic gradient (BSG) method is proposed for both convex and nonconvex programs. At each iteration, BSG approximates the gradient of the differentiable part of the objective by randomly sampling a small set of data or sampling a few functions in the objective, and then, using the approximate gradient, it updates all the blocks of variables in either a deterministic or a randomly shuffled order. Its convergence for the convex and nonconvex cases is established in different senses. In the convex case, the proposed method has the same order of convergence rate as the SG method. In the nonconvex case, its convergence is established in terms of the expected violation of a first-order optimality condition. The proposed method was numerically tested on problems including stochastic least squares and logistic regression, which are convex, as well as low-rank tensor recovery and bilinear logistic regression, which are nonconvex. On the convex problems, it performed as well as, and often significantly better than, the SG method. On the nonconvex problems, the proposed BSG method significantly outperformed the deterministic BCD method because the latter tends to slow down or stagnate near bad local minimizers. Overall, BSG inherits the benefits of both stochastic gradient approximation and block-coordinate updates.

Item
Learning Circulant Sensing Kernels (2012-01)
Xu, Yangyang; Yin, Wotao; Osher, Stanley
In signal acquisition, Toeplitz and circulant matrices are widely used as sensing operators. They correspond to discrete convolutions and are easily, or even naturally, realized in various applications. For compressive sensing, recent work has used random Toeplitz and circulant sensing matrices and demonstrated their efficiency in theory, in computer simulations, and in physical optical experiments. Motivated by recent work of Duarte-Carvajalino and Sapiro, we propose models to learn a circulant sensing matrix/operator for one- and higher-dimensional signals. Given the dictionary of the signal(s) to be sensed, the learned circulant sensing matrix/operator is more effective than a randomly generated circulant sensing matrix/operator, and even slightly more effective than a Gaussian random sensing matrix. In addition, by exploiting the circulant structure, we extend the learning from the patch scale in the work by Duarte-Carvajalino and Sapiro to the much larger image scale. Furthermore, we test learning the circulant sensing matrix/operator and the nonparametric dictionary together and obtain even better performance. We demonstrate these results using both synthetic sparse signals and real images.

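A circulant sensing operator, as discussed in the item above, can be applied without ever forming the matrix: multiplication by a circulant matrix is a circular convolution, i.e., a pointwise product in the Fourier domain, and compressive measurements keep only a subset of the outputs. The short sketch below checks this identity on a random kernel; the variable names and subsampling rule are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 16, 6
c = rng.standard_normal(n)                  # circulant kernel (first column of C)
x = rng.standard_normal(n)                  # signal to be sensed
idx = rng.choice(n, size=m, replace=False)  # keep m of the n convolution outputs

# Explicit circulant matrix: column j is the kernel cyclically shifted by j.
C = np.column_stack([np.roll(c, j) for j in range(n)])

# Fast application: C @ x equals the circular convolution ifft(fft(c) * fft(x)).
y_fast = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

assert np.allclose(C @ x, y_fast)
measurements = y_fast[idx]                  # compressive circulant sensing of x
print(measurements)
```
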
Item
Low-Rank Matrix Recovery using Unconstrained Smoothed-Lq Minimization (2011-09)
Lai, Ming-Jun; Xu, Yangyang; Yin, Wotao
A low-rank matrix can be recovered from a small number of its linear measurements. As a special case, the matrix completion problem aims to recover the matrix from a subset of its entries. Such problems share many common features with those of recovering sparse vectors. In this paper, we extend nonconvex Lq minimization and iteratively reweighted algorithms from recovering sparse vectors to recovering low-rank matrices. Unlike most existing work, this work focuses on unconstrained Lq minimization, for which we show several advantages on noisy measurements and/or approximately low-rank matrices. Based on results of Daubechies, DeVore, Fornasier, and Güntürk (2010) for constrained Lq minimization, we start with a preliminary yet novel analysis of unconstrained Lq minimization for sparse vectors, which includes convergence, an error bound, and local convergence rates. Then the algorithm and analysis are extended to the recovery of low-rank matrices. The algorithm has been compared to existing state-of-the-art methods and shows superior performance in recovering low-rank matrices with fast-decaying singular values from incomplete measurements.

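The unconstrained smoothed-Lq idea in the last item can be sketched, for the sparse-vector case, as a simple iteratively reweighted least squares loop: each iteration majorizes the smoothed Lq penalty by a weighted quadratic and solves a linear system. The objective, smoothing parameter, and weights below are illustrative choices, not the exact algorithm analyzed in the report.

```python
import numpy as np

# Illustrative objective: F(x) = 0.5*||A @ x - b||^2 + lam * sum_i (x_i^2 + eps)^(q/2)
rng = np.random.default_rng(3)
m, n, q, lam, eps = 40, 100, 0.5, 0.05, 1e-2
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = rng.standard_normal(5)
b = A @ x_true

x = np.zeros(n)
for it in range(100):
    # Majorize each (x_i^2 + eps)^(q/2) by a quadratic with weight w_i (concavity in x_i^2),
    # then minimize the resulting regularized least squares surrogate in closed form.
    w = (x ** 2 + eps) ** (q / 2 - 1)
    x = np.linalg.solve(A.T @ A + lam * q * np.diag(w), A.T @ b)

print("relative recovery error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```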