Skewers, the Carnegie Classification, and the Hybrid Bootstrap

Kosar, Robert

dc.contributor.advisor	Scott, David W	en_US
dc.creator	Kosar, Robert	en_US
dc.date.accessioned	2019-05-16T20:53:17Z	en_US
dc.date.available	2019-05-16T20:53:17Z	en_US
dc.date.created	2017-12	en_US
dc.date.issued	2017-11-30	en_US
dc.date.submitted	December 2017	en_US
dc.date.updated	2019-05-16T20:53:17Z	en_US
dc.description.abstract	Principal component analysis is an important statistical technique for dimension reduction and exploratory data analysis. However, it is not robust to outliers and may obfuscate important data structure such as clustering. We propose a version of principal component analysis based on the robust L2E method. The technique seeks to find the principal components of potentially highly non-spherical distribution components of a Gaussian mixture model. The algorithm requires neither specification of the number of clusters nor estimation of a full covariance matrix in order to run. The Carnegie classification is a decades-old (updated approximately every five years) taxonomy for research universities. However, it is based on questionable statistical methodology and suffers from a number of issues. We present a criticism of the Carnegie methodology, and offer two alternatives that are designed to be consistent with Carnegie's goals but also more statistically sound. We also present a visualization application where users can explore both the Carnegie system and our proposed systems. Preventing overfitting is an important topic in the field of machine learning, where it is common or even mundane to fit models with millions of parameters. One of the most popular algorithms for preventing overfitting is dropout. We present a drop-in replacement for dropout that offers superior performance on standard benchmark datasets and is relatively insensitive to hyperparameter choice.	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.citation	Kosar, Robert. "Skewers, the Carnegie Classification, and the Hybrid Bootstrap." (2017) Diss., Rice University. https://hdl.handle.net/1911/105553.	en_US
dc.identifier.uri	https://hdl.handle.net/1911/105553	en_US
dc.language.iso	eng	en_US
dc.rights	Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.	en_US
dc.subject	Clustering	en_US
dc.subject	Principal Component Analysis	en_US
dc.subject	University Rankings	en_US
dc.subject	Regularization	en_US
dc.title	Skewers, the Carnegie Classification, and the Hybrid Bootstrap	en_US
dc.type	Thesis	en_US
dc.type.material	Text	en_US
thesis.degree.department	Statistics	en_US
thesis.degree.discipline	Engineering	en_US
thesis.degree.grantor	Rice University	en_US
thesis.degree.level	Doctoral	en_US
thesis.degree.name	Doctor of Philosophy	en_US

Files

Original bundle

Name: KOSAR-DOCUMENT-2017.pdf
Size: 3.37 MB
Format: Adobe Portable Document Format

License bundle

Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text

Name: LICENSE.txt
Size: 2.61 KB
Format: Plain Text

Collections

Rice University Theses and Dissertations