Learning from SOM Voronoi Tessellations: Applications to Density Estimation and Clustering

dc.contributor.advisor: Merenyi, Erzsebet [en_US]
dc.creator: Taylor, Josh [en_US]
dc.date.accessioned: 2020-04-27T18:45:01Z [en_US]
dc.date.available: 2020-04-27T18:45:01Z [en_US]
dc.date.created: 2020-05 [en_US]
dc.date.issued: 2020-04-24 [en_US]
dc.date.submitted: May 2020 [en_US]
dc.date.updated: 2020-04-27T18:45:01Z [en_US]
dc.description.abstract: This work collectively demonstrates how a study of the geometry of the Voronoi tessellations induced by manifold learning via Self-Organizing Maps (SOMs) can inform the solution of two standard problems in unsupervised learning: density estimation and clustering. In Part I we introduce methodology based on SOMs for specifying locally adaptive bandwidths for non-parametric kernel density estimation. Bandwidth selection for this new estimator, dubbed VECS (Voronoi Ellipsoidal Coordinate Smoothing), exploits both the CADJ graph of SOM prototypes and the geometries of the Voronoi regions it connects; this helps VECS avoid the computational issues stemming from minimization of the traditional error functionals used in density estimation, making it suitable for use in higher dimensions. Combining VECS estimates with the organization of the SOM output lattice offers a new visualization for high-dimensional exploratory data analysis called VECSvis. In Part II we develop DMPrune, an automated method for removing edges of a CADJ graph based on a Dirichlet-Multinomial model of its edge weights. As a topology-representing graph, CADJ guides the most sophisticated SOM-based clusterings. DMPrune offers intelligent graph pruning by assigning a score to each CADJ edge and, consulting their likelihood, monitors the impacts of edge removal to maximize edge sparsity while minimizing loss of cohesion in the most important parts of the graph. This brings us closer to high-quality, automated cluster extraction from a learned SOM. For exposition, we exercise both methodologies on MER, a 7-band multispectral image of the Columbia Hills region of the Martian landscape imaged by the Mars Exploration Rover Spirit. Automated clusterings of the image spectra parameterized by DMPrune are shown to be of comparable quality to a previous, scientifically verified, MER clustering guided by human interaction. Additional structure emerging from these automated clusterings appears valid when compared to the VECSvis of this image. [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.citation: Taylor, Josh. "Learning from SOM Voronoi Tessellations: Applications to Density Estimation and Clustering." (2020) Diss., Rice University. https://hdl.handle.net/1911/108374. [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/108374 [en_US]
dc.language.iso: eng [en_US]
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. [en_US]
dc.subject: Self-Organizing Maps [en_US]
dc.subject: Kernel Density Estimation [en_US]
dc.subject: Clustering [en_US]
dc.title: Learning from SOM Voronoi Tessellations: Applications to Density Estimation and Clustering [en_US]
dc.type: Thesis [en_US]
dc.type.material: Text [en_US]
thesis.degree.department: Statistics [en_US]
thesis.degree.discipline: Engineering [en_US]
thesis.degree.grantor: Rice University [en_US]
thesis.degree.level: Doctoral [en_US]
thesis.degree.major: Unsupervised Machine Learning [en_US]
thesis.degree.name: Doctor of Philosophy [en_US]
Files
Original bundle
Name:
TAYLOR-DOCUMENT-2020.pdf
Size:
40.23 MB
Format:
Adobe Portable Document Format
License bundle
Name:
PROQUEST_LICENSE.txt
Size:
5.83 KB
Format:
Plain Text
Name:
LICENSE.txt
Size:
2.6 KB
Format:
Plain Text