Learning from SOM Voronoi Tessellations: Applications to Density Estimation and Clustering
dc.contributor.advisor | Merenyi, Erzsebet | en_US |
dc.creator | Taylor, Josh | en_US |
dc.date.accessioned | 2020-04-27T18:45:01Z | en_US |
dc.date.available | 2020-04-27T18:45:01Z | en_US |
dc.date.created | 2020-05 | en_US |
dc.date.issued | 2020-04-24 | en_US |
dc.date.submitted | May 2020 | en_US |
dc.date.updated | 2020-04-27T18:45:01Z | en_US |
dc.description.abstract | This work collectively demonstrates how a study of the geometry of the Voronoi tessellations induced by manifold learning via Self-Organizing Maps (SOMs) can inform the solution of two standard problems in unsupervised learning: density estimation and clustering. In Part I we introduce methodology based on SOMs for specifying locally adaptive bandwidths for non-parametric kernel density estimation. Bandwidth selection for this new estimator, dubbed VECS (Voronoi Ellipsoidal Coordinate Smoothing), exploits both the CADJ graph of SOM prototypes and the geometries of the Voronoi regions it connects; this helps VECS avoid the computational issues stemming from minimization of the traditional error functionals used in density estimation, making it suitable for use in higher dimensions. Combining VECS estimates with the organization of the SOM output lattice offers a new visualization for high-dimensional exploratory data analysis called VECSvis. In Part II we develop DMPrune, an automated method for removing edges of a CADJ graph based on a Dirichlet-Multinomial model of its edge weights. As a topology-representing graph, CADJ guides the most sophisticated SOM-based clusterings. DMPrune offers intelligent graph pruning by assigning a score to each CADJ edge and, consulting their likelihood, monitors the impacts of edge removal to maximize edge sparsity while minimizing loss of cohesion in the most important parts of the graph. This brings us closer to high-quality, automated cluster extraction from a learned SOM. For exposition, we exercise both methodologies on MER, a 7-band multispectral image of the Columbia Hills region of the Martian landscape imaged by the Mars Exploration Rover Spirit. Automated clusterings of the image spectra parameterized by DMPrune are shown to be of comparable quality to a previous, scientifically verified, MER clustering guided by human interaction. Additional structure revealed by these automated clusterings appears valid when compared to the VECSvis of this image. | en_US |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.citation | Taylor, Josh. "Learning from SOM Voronoi Tessellations: Applications to Density Estimation and Clustering." (2020) Diss., Rice University. https://hdl.handle.net/1911/108374. | en_US |
dc.identifier.uri | https://hdl.handle.net/1911/108374 | en_US |
dc.language.iso | eng | en_US |
dc.rights | Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. | en_US |
dc.subject | Self-Organizing Maps | en_US |
dc.subject | Kernel Density Estimation | en_US |
dc.subject | Clustering | en_US |
dc.title | Learning from SOM Voronoi Tessellations: Applications to Density Estimation and Clustering | en_US |
dc.type | Thesis | en_US |
dc.type.material | Text | en_US |
thesis.degree.department | Statistics | en_US |
thesis.degree.discipline | Engineering | en_US |
thesis.degree.grantor | Rice University | en_US |
thesis.degree.level | Doctoral | en_US |
thesis.degree.major | Unsupervised Machine Learning | en_US |
thesis.degree.name | Doctor of Philosophy | en_US |