Seminar • Algorithms and Complexity — New Nearly-Optimal Coreset for Kernel Density Estimation

Wednesday, September 9, 2020 10:00 am - 10:00 am EDT (GMT -04:00)

Please note: This seminar will be given online.

Wai Ming Tai, School of Computing
University of Utah

Seminar host: Professor Lap Chi Lau

Given a point set $P \subset \mathbb{R}^d$, the kernel density estimation for the Gaussian kernel is defined as $\overline{\mathcal{G}}_P(x) = \frac{1}{\left|P\right|}\sum_{p\in P}e^{-\left\lVert x-p \right\rVert^2}$ for any $x\in\mathbb{R}^d$. We study how to construct a small subset $Q$ of $P$ such that the kernel density estimation of $P$ can be approximated by the kernel density estimation of $Q$. Such a subset $Q$ is called a \emph{coreset}. The primary technique in this work is to construct a $\pm 1$ coloring of the point set $P$ via discrepancy theory and to apply this coloring algorithm recursively.
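To make the setup concrete, the following Python sketch computes the Gaussian kernel density estimate and illustrates the halve-and-recurse scheme: color the points $\pm 1$, keep one color class, and repeat. Note that the paper obtains its guarantee from a low-discrepancy coloring (via Banaszczyk-type results); the uniformly random coloring below is only a stand-in and does not achieve the stated coreset size.

    import numpy as np

    def gaussian_kde(P, x):
        """G_P(x) = (1/|P|) * sum_{p in P} exp(-||x - p||^2)."""
        diffs = P - x                                   # shape (|P|, d)
        return np.mean(np.exp(-np.sum(diffs**2, axis=1)))

    def halve(P, rng):
        """One halving step: +/-1 color the points and keep one color class.

        A random coloring is used here as a placeholder; the actual algorithm
        uses a low-discrepancy coloring so the kept half preserves the KDE.
        """
        chi = rng.choice([-1, 1], size=len(P))
        kept = P[chi == 1]
        return kept if len(kept) > 0 else P[chi == -1]

    def coreset(P, target_size, seed=0):
        """Recursively halve P until at most `target_size` points remain."""
        rng = np.random.default_rng(seed)
        Q = P
        while len(Q) > target_size:
            Q = halve(Q, rng)
        return Q

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        P = rng.normal(size=(4096, 2))                  # point set in R^2
        Q = coreset(P, target_size=256)
        x = np.zeros(2)
        print(gaussian_kde(P, x), gaussian_kde(Q, x))   # should be close for a good coloring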

Our result leverages Banaszczyk's Theorem. When $d>1$ is constant, our construction gives a coreset of size $O\left(\frac{1}{\varepsilon}\sqrt{\log\log\frac{1}{\varepsilon}}\right)$, as opposed to the previously best-known bound of $O\left(\frac{1}{\varepsilon}\sqrt{\log\frac{1}{\varepsilon}}\right)$. This is the first result to break the $\sqrt{\log}$ barrier, even when $d=2$.
To join this seminar virtually on Zoom, please go to https://zoom.us/j/96151296749.

Meeting ID: 961 5129 6749