Learning minimum volume sets with support vector machines

Davenport, Mark A.; Baraniuk, Richard G.; Scott, Clayton D.

Files

Dav2006Sep5Learningmi.PDF (713.91 KB)

Date

2006-09-01

Authors

Davenport, Mark A.

Baraniuk, Richard G.

Scott, Clayton D.

Abstract

Given a probability law P on d-dimensional Euclidean space, the minimum volume set (MV-set) with mass beta, 0 < beta < 1, is the set with smallest volume enclosing a probability mass of at least beta. We examine the use of support vector machines (SVMs) for estimating an MV-set from a collection of data points drawn from P, a problem with applications in clustering and anomaly detection. We investigate both one-class and two-class methods. The two-class approach reduces the problem to Neyman-Pearson (NP) classification, where we artificially generate a second class of data points according to a uniform distribution. The simple approach to generating the uniform data suffers from the curse of dimensionality. In this paper, we (1) describe the reduction of MV-set estimation to NP classification, (2) devise improved methods for generating artificial uniform data for the two-class approach, (3) advocate a new performance measure for systematic comparison of MV-set algorithms, and (4) establish a set of benchmark experiments to serve as a point of reference for future MV-set algorithms. We find that, in general, the two-class method performs more reliably.
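
The reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' SVM method: to stay self-contained it swaps the SVM for a much simpler candidate class, balls centered at the sample mean, and it assumes a standard Gaussian for P and a uniform second class drawn on the bounding box of the data. All names (`fit_ball`, `in_ball`, `uniform_rate`, `volume_estimate`) are hypothetical and chosen only for this example. The sketch shows the two quantities that the NP formulation trades off: the fraction of real data inside the set (constrained to be at least beta) and the fraction of artificial uniform points inside the set (proportional to the set's volume, and therefore the quantity to minimize).

```python
import math
import random


def fit_ball(data, beta):
    """Return (center, radius) of the smallest ball centered at the sample
    mean that contains at least a fraction beta of the data points."""
    n, d = len(data), len(data[0])
    center = [sum(x[j] for x in data) / n for j in range(d)]
    dists = sorted(math.dist(x, center) for x in data)
    k = math.ceil(beta * n)  # number of points that must fall inside
    return center, dists[k - 1]


def in_ball(x, center, radius):
    """True if point x lies inside the closed ball."""
    return math.dist(x, center) <= radius


rng = random.Random(0)
beta, d = 0.9, 2

# Class 0: sample from P (assumed standard Gaussian, purely for illustration).
data = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(2000)]

# Class 1: artificial uniform sample on the bounding box of the data.
lo = [min(x[j] for x in data) for j in range(d)]
hi = [max(x[j] for x in data) for j in range(d)]
box_volume = math.prod(h - l for l, h in zip(lo, hi))
uniform = [[rng.uniform(l, h) for l, h in zip(lo, hi)] for _ in range(20000)]

center, radius = fit_ball(data, beta)

# Empirical mass: fraction of class-0 points inside the set (must be >= beta).
mass = sum(in_ball(x, center, radius) for x in data) / len(data)
# Fraction of uniform points inside the set; proportional to its volume.
uniform_rate = sum(in_ball(u, center, radius) for u in uniform) / len(uniform)
volume_estimate = uniform_rate * box_volume

print(f"mass = {mass:.3f}, estimated volume = {volume_estimate:.2f}")
```

In two dimensions the Monte Carlo volume estimate should land close to the exact area `math.pi * radius**2`. As d grows, a ball occupies a vanishing fraction of its bounding box, so ever more uniform points are needed to get a usable estimate; this is one way to see the curse of dimensionality that the abstract attributes to the simple uniform-sampling approach.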

Type

Conference paper

Citation

M. A. Davenport, R. G. Baraniuk and C. D. Scott, "Learning minimum volume sets with support vector machines," 2006.