Learning the Structure of High-Dimensional Manifolds with Self-Organizing Maps for Accurate Information Extraction

Zhang, Lili

dc.contributor.advisor	Merenyi, Erzsebet	en_US
dc.creator	Zhang, Lili	en_US
dc.date.accessioned	2011-08-16T18:32:29Z	en_US
dc.date.available	2011-08-16T18:32:29Z	en_US
dc.date.issued	2011	en_US
dc.description	This paper was submitted by the author prior to the final official version. For the official version, please see http://hdl.handle.net/1911/70515	en_US
dc.description.abstract	This work aims to improve the capability of accurate information extraction from high-dimensional data, with a specific neural learning paradigm, the Self-Organizing Map (SOM). The SOM is an unsupervised learning algorithm that can faithfully sense the manifold structure and support supervised learning of relevant information from the data. Yet open problems regarding SOM learning exist. We focus on the following two issues. 1. Evaluation of topology preservation. Topology preservation is essential for an SOM to represent manifold structure faithfully. However, in reality, topology violations are not unusual, especially when the data have complicated structure. Measures capable of accurately quantifying and informatively expressing topology violations are lacking. One contribution of this work is a new measure, the Weighted Differential Topographic Function (WDTF), which differentiates an existing measure, the Topographic Function (TF), and incorporates detailed data distribution as an importance weighting of violations to distinguish severe violations from insignificant ones. Another contribution is an interactive visual tool, TopoView, which facilitates the visual inspection of violations on the SOM lattice. We show the effectiveness of the combined use of the WDTF and TopoView through a simple two-dimensional data set and two hyperspectral images. 2. Learning multiple latent variables from high-dimensional data. We use an existing two-layer SOM-hybrid supervised architecture, which captures the manifold structure in its SOM hidden layer and then uses its output layer to perform the supervised learning of latent variables. Customarily, the output layer uses only the strongest output of the SOM neurons. This severely limits the learning capability. We instead allow the k strongest responses of the SOM neurons to be used in the supervised learning.
Moreover, the fact that different latent variables can be best learned with different values of k motivates a new neural architecture, the Conjoined Twins, which extends the existing architecture with additional copies of the output layer, for preferential use of different values of k in the learning of different latent variables. We also automate the customization of k for different variables with the statistics derived from the SOM. The Conjoined Twins shows its effectiveness in the inference of two physical parameters from Near-Infrared spectra of planetary ices.	en_US
dc.format.extent	154	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.citation	Zhang, Lili. "Learning the Structure of High-Dimensional Manifolds with Self-Organizing Maps for Accurate Information Extraction." (2011) Rice University: https://hdl.handle.net/1911/62283.	en_US
dc.identifier.uri	https://hdl.handle.net/1911/62283	en_US
dc.language.iso	eng	en_US
dc.publisher	Rice University	en_US
dc.rights	Copyright is held by the author	en_US
dc.subject	Information extraction	en_US
dc.subject	Self-organizing maps	en_US
dc.subject	High-dimensional data	en_US
dc.subject	Neural networks	en_US
dc.subject	Manifold learning	en_US
dc.title	Learning the Structure of High-Dimensional Manifolds with Self-Organizing Maps for Accurate Information Extraction	en_US
dc.type.dcmi	Text	en_US
dc.type.genre	Thesis	en_US
thesis.degree.department	Applied Physics	en_US
thesis.degree.discipline	Natural Sciences	en_US
thesis.degree.grantor	Rice University	en_US
thesis.degree.level	Doctoral	en_US
thesis.degree.name	Doctor of Philosophy	en_US

Files

Original bundle

Name: PhD_thesis_LiliZhang_Apri062011_signedtitlepage.pdf
Size: 5.52 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.79 KB
Format: Item-specific license agreed upon to submission

Collections

Rice Graduate Student Collection