Dyadic decision trees

Date
2004
Abstract

This thesis introduces a new family of classifiers called dyadic decision trees (DDTs) and develops their theoretical properties within the framework of statistical learning theory.

First, we show that DDTs achieve optimal rates of convergence for a broad range of classification problems and are adaptive in three important respects: they automatically (1) adapt to favorable conditions near the Bayes decision boundary; (2) focus on data distributed on lower-dimensional manifolds; and (3) reject irrelevant features. DDTs are selected by penalized empirical risk minimization using a new data-dependent penalty and may be computed exactly and efficiently. DDTs are the first practical classifiers known to achieve optimal rates for the diverse class of distributions studied here. This is also the first study (of which we are aware) to consider rates for adaptation to data dimension and relevant features.

Second, we develop the theory of statistical learning under the Neyman-Pearson (NP) criterion. It is shown that concepts from learning with a Bayes error criterion have counterparts in the NP context. In particular, we consider constrained versions of empirical risk minimization and structural risk minimization (NP-SRM), proving performance guarantees for both. We also provide a general condition under which NP-SRM leads to strong universal consistency. Finally, we apply NP-SRM to dyadic decision trees, deriving rates of convergence and providing an explicit algorithm that implements NP-SRM in this setting.

Third, we study the problem of pruning a binary tree by minimizing an objective function that sums an additive cost with a non-additive penalty depending only on tree size. We focus on sub-additive penalties, which are motivated by theoretical results for dyadic and other decision trees. Considering the family of optimal prunings generated by varying the scalar multiplier of a sub-additive penalty, we show that this family is a subset of the analogous family produced by an additive penalty. By known results for additive penalties, this implies that the trees generated by a sub-additive penalty (1) are nested, (2) are unique, and (3) can be computed efficiently. It also implies that an additive penalty is preferable when using cross-validation to select from the family of possible prunings.
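In generic notation (a sketch for orientation, not necessarily the symbols used in the thesis), the two selection problems described above take the form

    \hat{T}_n = \arg\min_{T} \left[ \hat{R}_n(T) + \Phi_n(T) \right],

where \hat{R}_n(T) is the empirical risk of a tree T and \Phi_n(T) is the data-dependent penalty, and, for the Neyman-Pearson criterion,

    \min_{f} \; R_1(f) \quad \text{subject to} \quad R_0(f) \le \alpha,

where R_j(f) = P(f(X) \neq Y \mid Y = j) and \alpha is a user-specified false-alarm level.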

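The pruning family discussed in the third part of the abstract can be illustrated with a short Python sketch. It implements the standard bottom-up dynamic program for an additive penalty of lambda times the number of leaves; the Node class, the costs, and all names are hypothetical illustrations under that assumption, not the algorithm from the thesis. Increasing lambda yields the nested, unique, efficiently computable prunings that the abstract says sub-additive penalties inherit.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        leaf_cost: float               # additive cost incurred if this node is made a leaf
        left: Optional["Node"] = None
        right: Optional["Node"] = None

    def prune(node: Node, lam: float):
        """Return (pruned tree, value) minimizing cost + lam * (number of leaves)."""
        as_leaf = node.leaf_cost + lam             # objective if we collapse here
        if node.left is None:                      # already a leaf
            return Node(node.leaf_cost), as_leaf
        left, lv = prune(node.left, lam)
        right, rv = prune(node.right, lam)
        if as_leaf <= lv + rv:                     # collapsing is no worse: prune
            return Node(node.leaf_cost), as_leaf
        return Node(node.leaf_cost, left, right), lv + rv

    def num_leaves(t: Node) -> int:
        return 1 if t.left is None else num_leaves(t.left) + num_leaves(t.right)

    # As lam grows, the optimal prunings shrink and are nested: the property
    # additive penalties are known to guarantee.
    root = Node(6.0, Node(2.5, Node(1.0), Node(1.0)), Node(2.0))
    for lam in (0.1, 0.6, 1.5):
        tree, value = prune(root, lam)
        print(lam, num_leaves(tree))   # prints 3, then 2, then 1 leaves
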
Degree
Doctor of Philosophy
Type
Thesis
Keywords
Electronics, Electrical engineering
Citation

Scott, Clayton. "Dyadic decision trees." (2004) Diss., Rice University. https://hdl.handle.net/1911/18814.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.