Browsing by Author "Coleman, Ben Ray"
Kernel Sum Sketches for Large Scale Learning (2023-01-03)
Coleman, Ben Ray; Shrivastava, Anshumali
Kernel methods play a central role in machine learning and statistics, but algorithms for such methods scale poorly to large, high-dimensional datasets. Kernel sum computations are often the bottleneck, as they must aggregate all pairwise interactions between a query and each element of the dataset. Prior research has resulted in fast methods to approximate this sum with coresets, kernel approximations and adaptive sampling. However, existing methods still have prohibitively high memory and computation costs, especially for emerging applications in web-scale learning, genomics and streaming data. In my work, I have developed a compressed summary of the dataset, or sketch, that supports fast approximate sum queries for a special class of kernels. The sketch requires memory that is sub-linear in the data size and dimension, can be constructed in a single pass and comes with strong theoretical guarantees on the approximation error. In this thesis, I argue that kernel sum sketches are a new, useful tool for large-scale analysis and learning. I use the sketch to improve the resource-accuracy tradeoff by an order of magnitude for i) differentially private density estimation, linear regression and classification, ii) fast inverse propensity sampling and iii) memory-efficient near-neighbor search.
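To make the idea of a kernel sum and a single-pass sketch concrete, here is a minimal Python illustration. It is not taken from the thesis: the abstract does not describe how the sketch is built or which kernels it supports. The example assumes the kernel is the collision probability of a locality-sensitive hash (here, signed random projections with `bits` hyperplanes per hash), and it estimates the sum by averaging hashed counters. The names `exact_kernel_sum` and `CountArraySketch`, and the parameters `rows`, `bits` and `seed`, are hypothetical.

```python
import numpy as np


def exact_kernel_sum(data, query, p):
    """Brute-force sum over the dataset of k(x, q) = (1 - angle(x, q) / pi) ** p.

    This kernel is the probability that x and q agree on p independent
    signed-random-projection bits (an assumed kernel, for illustration only).
    """
    cos = data @ query / (np.linalg.norm(data, axis=1) * np.linalg.norm(query))
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    return float(np.sum((1.0 - theta / np.pi) ** p))


class CountArraySketch:
    """Assumed construction: `rows` independent hash functions, each with 2**bits counters."""

    def __init__(self, dim, rows=200, bits=2, seed=0):
        rng = np.random.default_rng(seed)
        # One set of `bits` random hyperplanes per row.
        self.planes = rng.standard_normal((rows, bits, dim))
        # Powers of two used to pack sign bits into a bucket index.
        self.weights = 1 << np.arange(bits)
        self.counts = np.zeros((rows, 1 << bits), dtype=np.int64)
        self.row_ids = np.arange(rows)

    def _buckets(self, x):
        # Sign pattern of x under each row's hyperplanes, packed into an integer.
        signs = (self.planes @ x) > 0
        return signs.astype(np.int64) @ self.weights

    def add(self, x):
        # Single-pass update: one counter increment per row.
        self.counts[self.row_ids, self._buckets(x)] += 1

    def query(self, q):
        # Each row's counter is an unbiased estimate of the kernel sum
        # (over the random hyperplanes); averaging rows reduces the variance.
        return float(self.counts[self.row_ids, self._buckets(q)].mean())
```

In this illustration the sketch stores `rows * 2**bits` counters no matter how many points are added, and each point is seen once. Calling `add` on every row of a dataset and then comparing `query(q)` against `exact_kernel_sum(data, q, p=bits)` shows the accuracy and memory tradeoff as `rows` varies.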