Biased and Unbiased Cross-Validation in Density Estimation

Scott, David W.; Terrell, George R.

dc.contributor.author	Scott, David W.	en_US
dc.contributor.author	Terrell, George R.	en_US
dc.date.accessioned	2018-06-18T17:27:37Z	en_US
dc.date.available	2018-06-18T17:27:37Z	en_US
dc.date.issued	1987-02	en_US
dc.date.note	February 1987	en_US
dc.description.abstract	Nonparametric density estimation requires the specification of smoothing parameters. The demands of statistical objectivity make it highly desirable to base the choice on properties of the data set. In this paper we introduce some biased cross-validation criteria for selection of smoothing parameters for kernel and histogram density estimators, closely related to one investigated in Scott and Factor (1981). These criteria are obtained by estimating L2-norms of derivatives of the unknown density and provide slightly biased estimates of the average squared-L2 error or mean integrated squared error. These criteria are roughly the analog of Wahba's (1981) generalized cross-validation procedure for orthogonal series density estimators. We present the relationship of the biased cross-validation procedure to the least squares cross-validation procedure, which provides unbiased estimates of the mean integrated squared error. Both methods are shown to be based on U-statistics. We compare the two methods by theoretical calculation of the noise in the cross-validation functions and corresponding cross-validated smoothing parameters, by Monte Carlo simulation, and by example. Surprisingly large gains in asymptotic efficiency are observed when biased cross-validation is compared to unbiased cross-validation if the underlying density is sufficiently smooth. The theoretical results explain some of the small sample behavior of cross-validation functions: we show that cross-validation algorithms can be unreliable for sample sizes that are "too small."
In order to aid the practitioner in the use of these appealing automatic cross-validation algorithms and to help facilitate evaluation of future algorithms, we must address some ofttimes controversial issues in density estimation: squared loss, the integrated squared error and mean integrated squared error criteria, adaptive density estimates, sample size requirements, and assumptions about the underlying density's smoothness. We conclude that the two cross-validation procedures behave quite differently so that one might well use both in practice.	en_US
dc.format.extent	45 pp	en_US
dc.identifier.citation	Scott, David W. and Terrell, George R. "Biased and Unbiased Cross-Validation in Density Estimation." (1987) https://hdl.handle.net/1911/101613.	en_US
dc.identifier.digital	TR87-02	en_US
dc.identifier.uri	https://hdl.handle.net/1911/101613	en_US
dc.language.iso	eng	en_US
dc.title	Biased and Unbiased Cross-Validation in Density Estimation	en_US
dc.type	Technical report	en_US
dc.type.dcmi	Text	en_US

Files

Original bundle

Name: TR87-02.pdf
Size: 669.21 KB
Format: Adobe Portable Document Format

Collections

CMOR Technical Reports