Robust Discriminant Analysis and Clustering by a Partial Minimum Integrated Squared Error Criterion

dc.contributor.advisor: Scott, David W. (en_US)
dc.contributor.committeeMember: Ensor, Katherine B. (en_US)
dc.contributor.committeeMember: Lane, David M. (en_US)
dc.creator: Adler, Yeshaya Adam (en_US)
dc.date.accessioned: 2019-05-16T20:11:36Z (en_US)
dc.date.available: 2019-05-16T20:11:36Z (en_US)
dc.date.created: 2017-08 (en_US)
dc.date.issued: 2017-08-10 (en_US)
dc.date.submitted: August 2017 (en_US)
dc.date.updated: 2019-05-16T20:11:36Z (en_US)
dc.description.abstract: In parametric supervised classification and unsupervised clustering, traditional methods are often inadequate when data are generated under departures from normality assumptions. A class of density power divergences was introduced by Basu et al. (1998) to alleviate these problems. This class of estimators is indexed by a parameter α that balances efficiency against robustness. It includes maximum likelihood estimation as a limiting case as α ↓ 0, and the special case known as L2E where α = 1 (Scott, 2001), which has been studied for its robustness properties. In this thesis, we develop two methods that use L2E estimation to perform discriminant analysis and modal clustering. Robust versions of discriminant analysis built on the Bayesian model usually replace the maximum likelihood estimates by plugging robust alternatives into the discriminant rule. We develop a robust discriminant analysis that does not rely on multiple plug-in estimates but instead estimates the model parameters jointly. We apply these methods to simulated and applied cases and show them to be robust to departures from normality. In the second application, we explore the problem of obtaining all possible modes of a kernel density estimate. We introduce a clustering method based on the stochastic mode tree, originally developed in an unpublished manuscript of Scott and Szewczyk (2000). This method applies the multivariate partial density component L2E estimator of Scott (2004), which includes maximum likelihood estimation as a limiting case, to locally probe the data and find all potential modes of a density. We provide an efficient implementation of the stochastic mode tree, which is repurposed to cluster the data according to its modal hierarchy. We explore the behavior of this clustering method with simulations and applied data.
We develop an interactive exploratory visualization tool that relates the modal clustering of a density to the optimal weights of individual partial density components. We show how this method can be used to interactively prune the stochastic mode tree to obtain a desired cluster hierarchy. Finally, we show our hierarchical mode clustering to be useful in image thresholding and segmentation. (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.citation: Adler, Yeshaya Adam. "Robust Discriminant Analysis and Clustering by a Partial Minimum Integrated Squared Error Criterion." (2017) Diss., Rice University. https://hdl.handle.net/1911/105484. (en_US)
dc.identifier.uri: https://hdl.handle.net/1911/105484 (en_US)
dc.language.iso: eng (en_US)
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder. (en_US)
dc.subject: Statistics (en_US)
dc.subject: Discriminant Analysis (en_US)
dc.subject: Mode Finding (en_US)
dc.title: Robust Discriminant Analysis and Clustering by a Partial Minimum Integrated Squared Error Criterion (en_US)
dc.type: Thesis (en_US)
dc.type.material: Text (en_US)
thesis.degree.department: Statistics (en_US)
thesis.degree.discipline: Engineering (en_US)
thesis.degree.level: Doctoral (en_US)
thesis.degree.grantor: Rice University (en_US)
thesis.degree.name: Doctor of Philosophy (en_US)
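
As a rough illustration of the L2E idea named in the abstract, the sketch below fits the mean of a univariate normal by minimizing the integrated squared error criterion, which for a model density f(x|θ) is the integral of f(x|θ)^2 minus (2/n) times the sum of f(x_i|θ). This is not the thesis's implementation: the function names, the toy data, the fixed scale sigma = 1, and the grid search are all assumptions made only for this example.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def l2e_criterion(data, mu, sigma):
    """L2E criterion for a normal model.

    The first term is the closed-form integral of the squared normal
    density, 1 / (2 * sigma * sqrt(pi)); the second term is twice the
    average model density at the observations.
    """
    integral_term = 1.0 / (2.0 * sigma * math.sqrt(math.pi))
    data_term = 2.0 * sum(normal_pdf(x, mu, sigma) for x in data) / len(data)
    return integral_term - data_term


def fit_l2e_mean(data, sigma=1.0, lo=-5.0, hi=15.0, steps=2001):
    """Grid search for the mean minimizing the criterion (sigma assumed known)."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda mu: l2e_criterion(data, mu, sigma))


# Hypothetical toy sample: five points near 0 and two gross outliers near 10.
data = [-1.0, -0.5, 0.0, 0.5, 1.0, 10.0, 10.5]

sample_mean = sum(data) / len(data)  # maximum likelihood estimate of the mean
l2e_mean = fit_l2e_mean(data)        # L2E estimate of the mean

print("sample mean:", sample_mean)
print("L2E mean:   ", l2e_mean)
```

On this toy sample the sample mean is pulled toward the outliers, while the L2E estimate stays with the main group near 0. A grid search is used only to keep the sketch dependency-free; a numerical optimizer over both location and scale would be the more usual choice.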
Files
Original bundle
Name:
ADLER-DOCUMENT-2017.pdf
Size:
4.96 MB
Format:
Adobe Portable Document Format
License bundle
Name:
PROQUEST_LICENSE.txt
Size:
5.84 KB
Format:
Plain Text
Name:
LICENSE.txt
Size:
2.61 KB
Format:
Plain Text