Overparameterization and double descent in PCA, GANs, and Diffusion models
Abstract
This thesis synthesizes my doctoral work on generative modeling, with a particular focus on overparameterization. Using a novel method that we call pseudo-supervision, we investigate approaches to characterizing overparameterization behaviors, including double descent, in GANs and in PCA-like problems. Extending pseudo-supervision to diffusion models, we show that it can be used to create an inductive bias, which lets us train our model with lower generalization error and faster convergence than the baseline. I additionally introduce a novel method called Boomerang, which extends our study of diffusion models by showing that they can be used for local sampling on image manifolds. Finally, in an approach that we call WaM, I extend FID to non-Gaussian feature distributions: the features are modeled with a Gaussian mixture model, and a bound on the 2-Wasserstein metric between Gaussian mixture models is used to define the metric.
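As background for the last point, FID scores two feature sets by fitting a Gaussian to each and taking the squared 2-Wasserstein (Fréchet) distance between the fits. The sketch below illustrates only that Gaussian building block, under the simplifying assumption of diagonal covariances, where the closed form reduces to a sum over coordinates. It is not the thesis's WaM method, and the function name `gaussian_w2_squared_diag` is hypothetical.

```python
import math


def gaussian_w2_squared_diag(mean_a, var_a, mean_b, var_b):
    """Squared 2-Wasserstein distance between two Gaussians.

    Assumption (for illustration only): both covariances are diagonal, so
    each is given as a list of per-coordinate variances. In that case
        W2^2 = sum_i (mean_a[i] - mean_b[i])^2
             + sum_i (sqrt(var_a[i]) - sqrt(var_b[i]))^2.
    """
    if not (len(mean_a) == len(var_a) == len(mean_b) == len(var_b)):
        raise ValueError("all inputs must have the same length")

    # Contribution of the difference between the means.
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))

    # Contribution of the difference between the per-coordinate standard deviations.
    cov_term = sum(
        (math.sqrt(va) - math.sqrt(vb)) ** 2 for va, vb in zip(var_a, var_b)
    )
    return mean_term + cov_term


if __name__ == "__main__":
    # Two unit-variance Gaussians whose means differ by (3, 4).
    print(gaussian_w2_squared_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0]))
```

With full covariance matrices the second term becomes a trace expression involving a matrix square root, which this sketch deliberately avoids.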
Citation
Luzi, Lorenzo. Overparameterization and double descent in PCA, GANs, and Diffusion models. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/116219