Browsing by Author "Scott, Clayton"
Now showing 1 - 8 of 8
Item: CORT: Classification Or Regression Trees (2003-06-20)
Scott, Clayton; Willett, Rebecca; Nowak, Robert David; Digital Signal Processing (http://dsp.rice.edu/)
In this paper we challenge three of the underlying principles of CART, a well-known approach to the construction of classification and regression trees. Our primary concern is with the penalization strategy employed to prune back an initial, overgrown tree. We reason, based on both intuitive and theoretical arguments, that the pruning rule for classification should be different from that used for regression (unlike CART). We also argue that growing a tree-structured partition that is specifically fitted to the data is unnecessary. Instead, our approach to tree modeling begins with a nonadapted (fixed) dyadic tree structure and partition, much like that underlying multiscale wavelet analysis. We show that dyadic trees provide sufficient flexibility, are easy to construct, and produce near-optimal results when properly pruned. Finally, we advocate the use of a negative log-likelihood measure of empirical risk. This is a more appropriate empirical risk for non-Gaussian regression problems, in contrast to the sum-of-squared errors criterion used in CART regression.

Item: CORT: Classification Or Regression Trees (2003-04-20)
Scott, Clayton; Willett, Rebecca; Nowak, Robert David; Digital Signal Processing (http://dsp.rice.edu/)
In this paper we challenge three of the underlying principles of CART, a well-known approach to the construction of classification and regression trees. Our primary concern is with the penalization strategy employed to prune back an initial, overgrown tree. We reason, based on both intuitive and theoretical arguments, that the pruning rule for classification should be different from that used for regression (unlike CART). We also argue that growing a tree-structured partition that is specifically fitted to the data is unnecessary. Instead, our approach to tree modeling begins with a nonadapted (fixed) dyadic tree structure and partition, much like that underlying multiscale wavelet analysis. We show that dyadic trees provide sufficient flexibility, are easy to construct, and produce near-optimal results when properly pruned. Finally, we advocate the use of a negative log-likelihood measure of empirical risk. This is a more appropriate empirical risk for non-Gaussian regression problems, in contrast to the sum-of-squared errors criterion used in CART regression.
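Illustrative sketch (not taken from the paper): the CORT abstracts above argue that for non-Gaussian regression a negative log-likelihood empirical risk is more appropriate than CART's sum-of-squared-errors criterion. The Python snippet below contrasts the two criteria for the observations falling in a single leaf, assuming a Poisson leaf model purely as an example; the function names and toy data are hypothetical.

    import numpy as np

    def leaf_sse(y):
        """Sum-of-squared-errors risk for one leaf (CART regression):
        fit the leaf with its sample mean and sum the squared residuals."""
        y = np.asarray(y, dtype=float)
        return float(np.sum((y - y.mean()) ** 2))

    def leaf_poisson_nll(y):
        """Negative log-likelihood risk for one leaf under an assumed Poisson
        model, with the leaf rate estimated by the sample mean.
        (Constant log(y!) terms are dropped; they do not affect pruning.)"""
        y = np.asarray(y, dtype=float)
        rate = max(y.mean(), 1e-12)   # guard against log(0) for all-zero leaves
        return float(np.sum(rate - y * np.log(rate)))

    # Toy leaf of count observations: the two criteria can rank candidate
    # prunings differently, which is the point the abstract makes.
    leaf = [0, 1, 0, 2, 7, 1]
    print(leaf_sse(leaf), leaf_poisson_nll(leaf))

Summing such a leaf risk over the leaves of a candidate pruning gives the empirical risk that the penalization strategy discussed in the abstract acts on.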
Item: Dyadic decision trees (2004)
Scott, Clayton; Nowak, Robert D.
This thesis introduces a new family of classifiers called dyadic decision trees (DDTs) and develops their theoretical properties within the framework of statistical learning theory. First, we show that DDTs achieve optimal rates of convergence for a broad range of classification problems and are adaptive in three important respects: They automatically (1) adapt to favorable conditions near the Bayes decision boundary; (2) focus on data distributed on lower dimensional manifolds; and (3) reject irrelevant features. DDTs are selected by penalized empirical risk minimization using a new data-dependent penalty and may be computed exactly and efficiently. DDTs are the first practical classifier known to achieve optimal rates for the diverse class of distributions studied here. This is also the first study (of which we are aware) to consider rates for adaptation to data dimension and relevant features. Second, we develop the theory of statistical learning using the Neyman-Pearson (NP) criterion. It is shown that concepts from learning with a Bayes error criterion have counterparts in the NP context. Thus, we consider constrained versions of empirical risk minimization and structural risk minimization (NP-SRM), proving performance guarantees for both. We also provide a general condition under which NP-SRM leads to strong universal consistency. Finally, we apply NP-SRM to dyadic decision trees, deriving rates of convergence and providing an explicit algorithm to implement NP-SRM in this setting. Third, we study the problem of pruning a binary tree by minimizing an objective function that sums an additive cost with a non-additive penalty depending only on tree size. We focus on sub-additive penalties which are motivated by theoretical results for dyadic and other decision trees. Consider the family of optimal prunings generated by varying the scalar multiplier of a sub-additive penalty. We show this family is a subset of the analogous family produced by an additive penalty. This implies (by known results for additive penalties) that the trees generated by a sub-additive penalty (1) are nested; (2) are unique; and (3) can be computed efficiently. It also implies that an additive penalty is preferable when using cross-validation to select from the family of possible prunings.
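Illustrative sketch (not taken from the thesis): the third part of the abstract above concerns pruning a binary tree by minimizing an additive cost plus a penalty that depends only on tree size. The snippet below implements the standard bottom-up dynamic program for the additive case, i.e. leaf costs plus alpha times the number of leaves; the Node class and its fields are hypothetical.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        # Hypothetical tree container: 'cost' is the additive empirical cost
        # incurred if this node is made a leaf; children are None at the leaves.
        cost: float
        left: Optional["Node"] = None
        right: Optional["Node"] = None

    def prune(node: Node, alpha: float):
        """Return (pruned_subtree, objective) minimizing
        sum of leaf costs + alpha * (number of leaves),
        via the usual bottom-up dynamic program for additive penalties."""
        leaf_obj = node.cost + alpha              # objective if we cut here
        if node.left is None or node.right is None:
            return Node(node.cost), leaf_obj
        left_sub, left_obj = prune(node.left, alpha)
        right_sub, right_obj = prune(node.right, alpha)
        keep_obj = left_obj + right_obj           # objective if we keep the split
        if keep_obj < leaf_obj:
            return Node(node.cost, left_sub, right_sub), keep_obj
        return Node(node.cost), leaf_obj          # prune; ties favor the smaller tree

    # Toy example: at alpha = 1.0 both child splits are pruned away,
    # leaving a two-leaf tree rooted at the original root.
    tree = Node(10.0,
                Node(2.0, Node(0.5), Node(0.5)),
                Node(3.0, Node(1.5), Node(1.4)))
    pruned, obj = prune(tree, alpha=1.0)
    print(obj)

Per the abstract, the optimal prunings produced by a sub-additive penalty (as its multiplier varies) form a subset of the family obtained by sweeping alpha in this additive program, so the additive family is the natural one to compute and then select from, e.g. by cross-validation.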
Item: A Hierarchical Wavelet-Based Framework for Pattern Analysis and Synthesis (2000-04-20)
Scott, Clayton; Center for Multimedia Communications (http://cmc.rice.edu/)
Despite their success in other areas of statistical signal processing, current wavelet-based image models are inadequate for modeling patterns in images, due to the presence of unknown transformations inherent in most pattern observations. In this thesis we introduce a hierarchical wavelet-based framework for modeling patterns in digital images. This framework takes advantage of the efficient image representations afforded by wavelets, while accounting for unknown pattern transformations. Given a trained model, we can use this framework to synthesize pattern observations. If the model parameters are unknown, we can infer them from labeled training data using TEMPLAR, a novel template learning algorithm with linear complexity. TEMPLAR employs minimum description length (MDL) complexity regularization to learn a template with a sparse representation in the wavelet domain. If we are given several trained models for different patterns, our framework provides a low-dimensional subspace classifier that is invariant to unknown pattern transformations as well as background clutter.

Item: Hierarchical Wavelet-Based Image Model for Pattern Analysis and Synthesis (2000-07-20)
Scott, Clayton; Nowak, Robert David; Center for Multimedia Communications (http://cmc.rice.edu/); Digital Signal Processing (http://dsp.rice.edu/)
Despite their success in other areas of statistical signal processing, current wavelet-based image models are inadequate for modeling patterns in images, due to the presence of unknown transformations (e.g., translation, rotation, scaling) inherent in most pattern observations. In this paper we introduce a hierarchical wavelet-based framework for modeling patterns in digital images. This framework takes advantage of the efficient image representations afforded by wavelets, while accounting for unknown pattern transformations. Given a trained model, we can use this framework to synthesize pattern observations. If the model parameters are unknown, we can infer them from labeled training data using TEMPLAR (Template Learning from Atomic Representations), a novel template learning algorithm with linear complexity. TEMPLAR employs minimum description length (MDL) complexity regularization to learn a template with a sparse representation in the wavelet domain. We illustrate template learning with examples, and discuss how TEMPLAR applies to pattern classification and denoising from multiple, unaligned observations.

Item: Statistical Signal Processing (Rice University, 2013-12-05)
Scott, Clayton
To learn Statistical Signal Processing.

Item: TEMPLAR: A Wavelet-Based Framework for Pattern Learning and Analysis (2001-04-20)
Scott, Clayton; Nowak, Robert David; Digital Signal Processing (http://dsp.rice.edu/)
Despite the success of wavelet decompositions in other areas of statistical signal and image processing, current wavelet-based image models are inadequate for modeling patterns in images, due to the presence of unknown transformations (e.g., translation, rotation, location of lighting source) inherent in most pattern observations. In this paper we introduce a hierarchical wavelet-based framework for modeling patterns in digital images. This framework takes advantage of the efficient image representations afforded by wavelets, while accounting for unknown pattern transformations. Given a trained model, we can use this framework to synthesize pattern observations. If the model parameters are unknown, we can infer them from labeled training data using TEMPLAR (Template Learning from Atomic Representations), a novel template learning algorithm with linear complexity. TEMPLAR employs minimum description length (MDL) complexity regularization to learn a template with a sparse representation in the wavelet domain. We discuss several applications, including template learning, pattern classification, and image registration.

Item: Template Learning from Atomic Representations: A Wavelet-Based Approach to Pattern Analysis (2001-04-20)
Scott, Clayton; Nowak, Robert David; Digital Signal Processing (http://dsp.rice.edu/)
Despite the success of wavelet decompositions in other areas of statistical signal and image processing, current wavelet-based image models are inadequate for modeling patterns in images, due to the presence of unknown transformations (e.g., translation, rotation, location of lighting source) inherent in most pattern observations. In this paper we introduce a hierarchical wavelet-based framework for modeling patterns in digital images. This framework takes advantage of the efficient image representations afforded by wavelets, while accounting for unknown pattern transformations. Given a trained model, we can use this framework to synthesize pattern observations. If the model parameters are unknown, we can infer them from labeled training data using TEMPLAR (Template Learning from Atomic Representations), a novel template learning algorithm with linear complexity. TEMPLAR employs minimum description length (MDL) complexity regularization to learn a template with a sparse representation in the wavelet domain. We discuss several applications, including template learning, pattern classification, and image registration.
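Illustrative sketch (not the TEMPLAR algorithm): the wavelet-based items above describe learning a template with a sparse wavelet-domain representation via MDL complexity regularization. The snippet below only conveys the flavor of that idea: average (assumed pre-aligned) training images, transform with a single-level Haar wavelet, and retain a coefficient only if its squared magnitude exceeds an MDL-style per-coefficient cost. The Haar transform, the log(n) cost, and the toy data are simplifying assumptions, not details from the papers, which additionally handle unknown transformations.

    import numpy as np

    def haar2d(x):
        """Single-level 2D Haar transform of an array with even dimensions
        (orthonormal normalization)."""
        a = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)     # row averages
        d = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)     # row differences
        row = np.hstack([a, d])
        a2 = (row[0::2, :] + row[1::2, :]) / np.sqrt(2)  # column averages
        d2 = (row[0::2, :] - row[1::2, :]) / np.sqrt(2)  # column differences
        return np.vstack([a2, d2])

    def mdl_sparse_template(images, penalty=None):
        """Average the (assumed pre-aligned) training images, transform, and keep
        only coefficients whose squared magnitude exceeds an MDL-style
        per-coefficient cost; the rest are zeroed, giving a sparse template."""
        stack = np.stack([haar2d(np.asarray(im, dtype=float)) for im in images])
        mean_coeffs = stack.mean(axis=0)
        n = mean_coeffs.size
        if penalty is None:
            # crude MDL-flavored cost: noise-variance estimate times log(n)
            noise_var = stack.var(axis=0).mean()
            penalty = noise_var * np.log(n)
        keep = mean_coeffs ** 2 > penalty
        return np.where(keep, mean_coeffs, 0.0), keep

    # Toy usage: three noisy 8x8 observations of the same blocky pattern.
    rng = np.random.default_rng(0)
    pattern = np.zeros((8, 8))
    pattern[2:6, 2:6] = 4.0
    obs = [pattern + 0.5 * rng.standard_normal((8, 8)) for _ in range(3)]
    template, mask = mdl_sparse_template(obs)
    print(int(mask.sum()), "of", mask.size, "coefficients retained")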