Introduced by Agarwal, Har-Peled, and Varadarajan (2004), an epsilon-kernel of a point set is a coreset that can be used to approximate the width, minimum enclosing cylinder, and minimum bounding box, and to solve various related geometric optimization problems. Such coresets form one of the most important tools in the design of linear-time approximation algorithms in computational geometry, as well as efficient insertion-only streaming algorithms and dynamic (non-streaming) data structures. In this paper, we continue the theme and explore dynamic streaming algorithms (in the so-called turnstile model).
Andoni and Nguyen [SODA'12] described a dynamic streaming algorithm for maintaining a (1+epsilon)-approximation of the width using O(polylog U) space and update time for a point set in [U]^d for any constant dimension d and any constant epsilon > 0. Their sketch, based on a polynomial method, does not explicitly maintain an epsilon-kernel. We extend their method to maintain an epsilon-kernel, and at the same time reduce some of the logarithmic factors. As an application, we obtain the first randomized dynamic streaming algorithm for the width problem (and related geometric optimization problems) that supports k outliers, using poly(k, log U) space and time.
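To make the notion of an epsilon-kernel concrete, here is a minimal sketch that assumes the standard extent-based definition: a subset Q of P is an epsilon-kernel if, for every direction u, the width of Q along u is at least (1 - epsilon) times the width of P along u. The sketch only tests this condition on a finite sample of directions in the plane, so it is a sanity check rather than a proof, and it is unrelated to the streaming sketch described above. The names `directional_width`, `looks_like_kernel`, and `num_dirs` are hypothetical.

```python
import math

def directional_width(points, u):
    """Extent of `points` along unit direction u: max <p,u> minus min <p,u>."""
    proj = [p[0] * u[0] + p[1] * u[1] for p in points]
    return max(proj) - min(proj)

def looks_like_kernel(P, Q, eps, num_dirs=360):
    """Check width(Q,u) >= (1-eps)*width(P,u) on sampled directions only.

    Assumption: 2D points; only num_dirs evenly spaced directions in
    [0, pi) are tested, so a True result is evidence, not a guarantee.
    """
    for i in range(num_dirs):
        theta = math.pi * i / num_dirs
        u = (math.cos(theta), math.sin(theta))
        # Small tolerance guards against floating-point round-off.
        if directional_width(Q, u) < (1.0 - eps) * directional_width(P, u) - 1e-12:
            return False
    return True

# Toy data: a 4-by-2 rectangle with a few interior points.
P = [(0, 0), (4, 0), (4, 2), (0, 2), (2, 1), (1, 1), (3, 1)]
corners = [(0, 0), (4, 0), (4, 2), (0, 2)]
bottom_edge = [(0, 0), (4, 0)]

corners_ok = looks_like_kernel(P, corners, eps=0.1)
bottom_ok = looks_like_kernel(P, bottom_edge, eps=0.1)
```

In this toy input the four corners preserve every directional width exactly, while the bottom edge alone loses almost all of the vertical extent and fails the test.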
Recently, Arya, da Fonseca, and Mount [STOC 2011, SODA 2012] made notable progress in improving the epsilon-dependencies in the space/query-time tradeoffs for (1+epsilon)-factor approximate nearest neighbor search in fixed-dimensional Euclidean spaces. However, epsilon-dependencies in the preprocessing time were not considered, and so their data structures cannot be used to derive faster algorithms for offline proximity problems. Known algorithms for many such problems, including approximate bichromatic closest pair (BCP) and approximate Euclidean minimum spanning trees (EMST), typically have factors near (1/epsilon)^{d/2 +/- O(1)} in the running time when the dimension d is a constant.
We describe a technique that breaks the (1/epsilon)^{d/2} barrier and yields new results for many well-known proximity problems, including:
The improvement arises from a new time bound for exact "discrete Voronoi diagrams", which were previously used in the construction of epsilon-kernels (or extent-based coresets), a well-known tool for another class of fundamental problems. This connection leads to more results, including:
At SODA'10, Agarwal and Sharathkumar presented a streaming algorithm for approximating the minimum enclosing ball of a set of points in d-dimensional Euclidean space. Their algorithm requires one pass, uses O(d) space, and was shown to have approximation factor at most (1+sqrt{3})/2 + eps ~ 1.3661. We prove that the same algorithm has approximation factor less than 1.22, which brings us much closer to a (1+sqrt{2})/2 ~ 1.207 lower bound given by Agarwal and Sharathkumar.
We also apply this technique to the dynamic version of the minimum enclosing ball problem (in the non-streaming setting). We give an O(dn)-space data structure that can maintain a 1.22-approximate minimum enclosing ball in O(d log n) expected amortized time per insertion/deletion.
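The constants quoted above can be checked with a few lines of arithmetic. This is only a numeric illustration of how far the new bound narrows the gap between the upper and lower bounds; the variable names are hypothetical and epsilon is taken to be zero.

```python
import math

# Previously proven approximation factor (ignoring the additive eps).
old_upper = (1 + math.sqrt(3)) / 2
# Lower bound given by Agarwal and Sharathkumar.
lower = (1 + math.sqrt(2)) / 2
# Improved approximation factor stated in the abstract.
new_upper = 1.22

gap_before = old_upper - lower
gap_after = new_upper - lower

print(f"old upper bound: {old_upper:.4f}")
print(f"lower bound:     {lower:.4f}")
print(f"gap before:      {gap_before:.4f}")
print(f"gap after:       {gap_after:.4f}")
```

The gap between the known upper and lower bounds shrinks from roughly 0.159 to under 0.013.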
We give a dynamic data structure that can maintain an epsilon-coreset of n points, with respect to the extent measure, in O(log n) time for any constant epsilon > 0 and any constant dimension. The previous method by Agarwal, Har-Peled, and Varadarajan requires polylogarithmic update time. For points with integer coordinates bounded by U, we alternatively get O(log log U) time. Numerous applications follow, for example, on dynamically approximating the width, smallest enclosing cylinder, minimum bounding box, or minimum-width annulus. We can also use the same approach to maintain approximate k-centers in O(min{log n, log log U}) randomized amortized time for any constant k and any constant dimension. For the smallest enclosing cylinder problem, we also show that a constant-factor approximation can be maintained in O(1) randomized amortized time on the word RAM.
We study the problem of maintaining a (1+epsilon)-factor approximation of the diameter of a stream of points under the sliding window model. In one dimension, we give a simple algorithm that only needs to store O((1/epsilon) log R) points at any time, where the parameter R denotes the "spread" of the point set. This bound is optimal and improves Feigenbaum, Kannan, and Zhang's recent solution by two logarithmic factors. We then extend our one-dimensional algorithm to higher constant dimensions and, at the same time, correct an error in the previous solution. In high nonconstant dimensions, we also observe a constant-factor approximation algorithm that requires sublinear space. Related optimization problems, such as the width, are also considered in the two-dimensional case.
We speed up previous (1+epsilon)-factor approximation algorithms for a number of geometric optimization problems in fixed dimensions: diameter, width, minimum-radius enclosing cylinder, minimum-width annulus, minimum-volume bounding box, minimum-width cylindrical shell, etc. Linear time bounds were known before; we further improve the dependence of the "constants" on epsilon.
We next consider the data stream model and present new (1+epsilon)-factor approximation algorithms that need only constant space for all of the above problems in any fixed dimension. Previously, such a result was known only for diameter.
Both sets of results are obtained using the core-set framework recently proposed by Agarwal, Har-Peled, and Varadarajan.
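For intuition about why constant space can suffice for the diameter, here is a minimal sketch of a well-known direction-based approach in the plane. It is not the core-set method of this paper. It keeps the minimum and maximum projection along a fixed set of directions whose number depends only on epsilon, and reports the largest extent. If the directions in [0, pi) are spaced by an angle theta, the largest extent lies between D cos(theta/2) and D, where D is the true diameter. The class name `DirectionalDiameterSketch` and its methods are hypothetical, and epsilon > 0 is assumed.

```python
import math

class DirectionalDiameterSketch:
    """Insertion-only (1+eps)-approximate diameter in 2D (assumes eps > 0)."""

    def __init__(self, eps):
        # Pick k directions over [0, pi) so that cos(pi / (2k)) >= 1 / (1 + eps).
        half_angle = math.acos(1.0 / (1.0 + eps))
        k = max(1, math.ceil(math.pi / (2.0 * half_angle)))
        self.dirs = [(math.cos(math.pi * i / k), math.sin(math.pi * i / k))
                     for i in range(k)]
        self.lo = [math.inf] * k    # smallest projection seen per direction
        self.hi = [-math.inf] * k   # largest projection seen per direction
        self.count = 0

    def insert(self, p):
        """Process one stream point; stored state does not grow with the stream."""
        self.count += 1
        for i, (ux, uy) in enumerate(self.dirs):
            t = p[0] * ux + p[1] * uy
            if t < self.lo[i]:
                self.lo[i] = t
            if t > self.hi[i]:
                self.hi[i] = t

    def estimate(self):
        """Largest extent over the stored directions; 0.0 for an empty stream."""
        if self.count == 0:
            return 0.0
        return max(h - l for h, l in zip(self.hi, self.lo))

# Usage on a tiny stream.
pts = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (-2.0, 0.5)]
eps = 0.1
sketch = DirectionalDiameterSketch(eps)
for p in pts:
    sketch.insert(p)
estimate = sketch.estimate()
```

The estimate never exceeds the true diameter, and multiplying it by (1 + eps) gives a value at least as large as the true diameter.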