Browsing by Author "Hand, Paul E"
Now showing 1 - 2 of 2
Item: A Convex Algorithm for Mixed Linear Regression (2017-03-22)
Joshi, Babhru; Hand, Paul E
Mixed linear regression is a high-dimensional affine-space clustering problem: the goal is to find the parameters of multiple affine spaces that best fit a collection of points. We introduce a convex second-order cone program (based on the l1/fused lasso) that reformulates mixed linear regression as a clustering problem in R^d, where clustering is more tractable. The convex program is parameter-free and does not require prior knowledge of the number of clusters. In the noiseless case, we prove that the convex program recovers the regression coefficients exactly under narrow technical conditions of well-separation and balance. We demonstrate numerical performance on BikeShare data and on music tone perception data.

Item: Convergence of K-indicators Clustering with Alternating Projection Algorithms (2017-11-21)
Yang, Yuchen; Zhang, Yin; Schaefer, Andrew J.; Hand, Paul E
Data clustering is a fundamental unsupervised machine learning problem, and for decades the most widely used clustering method has been k-means. Recently, a newly proposed algorithm called KindAP, based on subspace matching and a semi-convex relaxation scheme, has outperformed k-means in several respects: it requires no random replications and is insensitive to initialization. Empirical evidence suggests that, unlike k-means, KindAP can correctly identify well-separated globular clusters even when the number of clusters is large, but a rigorous theoretical analysis has been lacking. This study improves the algorithm design and establishes a first-step theory for KindAP. KindAP is a two-layered alternating projection procedure applied to two non-convex sets. The inner loop solves an intermediate model via a semi-convex relaxation scheme that relaxes the more complicated of the two non-convex sets while keeping the other intact. We first derive a convergence result for this inner loop. Then, under the "ideal data" assumption that the n data points are located at exactly k distinct positions, we prove that KindAP converges to the global minimum with the help of the outer loop. Work is ongoing to extend this analysis from the ideal-data case to more general settings.
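The second item builds KindAP out of alternating projections between two sets. As a minimal, hypothetical illustration of that primitive (not the KindAP algorithm itself, which projects between structured matrix sets), the sketch below alternates projections between a non-convex set, the unit circle, and a line in R^2, and converges to a point in their intersection:

```python
import math

# Sketch of the alternating projection primitive: repeatedly project a
# point onto two sets in turn. The unit circle is non-convex, loosely
# analogous to the non-convex indicator set KindAP works with.

def proj_circle(p):
    # Nearest point on the unit circle {x : ||x|| = 1}.
    x, y = p
    n = math.hypot(x, y)
    return (x / n, y / n) if n > 0 else (1.0, 0.0)

def proj_line(p):
    # Nearest point on the line y = x.
    t = (p[0] + p[1]) / 2.0
    return (t, t)

p = (2.0, 0.3)
for _ in range(100):
    p = proj_line(proj_circle(p))

# The iterates settle at an intersection point, (1/sqrt(2), 1/sqrt(2)).
print(p)
```

KindAP layers two such loops: an inner loop on a relaxed (semi-convex) version of the harder set, and an outer loop on the original pair of non-convex sets.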
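For the first item above, the paper's contribution is a parameter-free convex second-order cone program; that formulation needs a conic solver and is not reproduced here. As a hypothetical point of contrast, the sketch below solves a toy 1-D mixed linear regression instance with the classical alternating (EM-style) heuristic, which, unlike the convex program, must be told the number of lines and is sensitive to initialization:

```python
import random

# Toy mixed linear regression: points drawn from two unknown lines
# y = a*x + b. Alternate between (1) assigning each point to the line
# with the smaller squared residual and (2) refitting each line by
# least squares on its assigned points.

def fit_line(pts):
    # Closed-form least squares for y = a*x + b over (x, y) pairs.
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def alternating_mlr(points, iters=50):
    # Assumption baked in: exactly two lines, fixed initial guesses.
    lines = [(1.0, 0.0), (-1.0, 0.0)]
    for _ in range(iters):
        groups = [[], []]
        for x, y in points:
            r = [(y - (a * x + b)) ** 2 for a, b in lines]
            groups[r.index(min(r))].append((x, y))
        # Keep the old line if a group is too small to refit.
        lines = [fit_line(g) if len(g) >= 2 else l
                 for g, l in zip(groups, lines)]
    return lines

random.seed(0)
pts = [(x, 2.0 * x + 1.0) for x in (random.uniform(-1, 1) for _ in range(20))]
pts += [(x, -3.0 * x + 0.5) for x in (random.uniform(-1, 1) for _ in range(20))]
lines = alternating_mlr(pts)
print(lines)
```

Each sweep is monotone in the sum of per-point minimum squared residuals: the assignment step minimizes it given the lines, and the least-squares refit minimizes it given the assignment.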