Hicham El-Zein, PhD candidate
David R. Cheriton School of Computer Science
We present succinct data structures for one-dimensional color reporting and color counting problems. We are given a set of $n$ points with integer coordinates in the range $[1,m]$, and every point is assigned a color from the set $\{\,1,\ldots,\sigma\,\}$. A color reporting query asks for the list of distinct colors that occur in a query interval $[a,b]$, and a color counting query asks for the number of distinct colors in $[a,b]$.
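To make the two query types concrete, here is a minimal baseline sketch in Python. It is not the succinct structure presented here: it stores all points and colors explicitly and takes time proportional to the number of points in $[a,b]$. The function names `color_report` and `color_count`, and the input layout (a sorted list of coordinates with a parallel list of colors), are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def color_report(points, colors, a, b):
    """Return the distinct colors of the points whose coordinate lies in [a, b].

    Assumes `points` is sorted in increasing order and colors[i] is the
    color of points[i] (hypothetical input layout, for illustration only).
    """
    lo = bisect_left(points, a)    # index of the first point >= a
    hi = bisect_right(points, b)   # index just past the last point <= b
    seen = set()
    result = []
    for c in colors[lo:hi]:
        if c not in seen:          # report each color once, in first-seen order
            seen.add(c)
            result.append(c)
    return result


def color_count(points, colors, a, b):
    """Return the number of distinct colors among the points in [a, b]."""
    return len(color_report(points, colors, a, b))
```

For example, with `points = [1, 3, 4, 7, 9]` and `colors = [2, 1, 2, 3, 1]`, the query interval $[3,7]$ contains the points 3, 4, and 7, so the reporting query returns the colors 1, 2, and 3 and the counting query returns 3.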
We describe a succinct data structure that answers approximate color counting queries in $O(1)$ time and uses $\mathcal{B}(n,m) + O(n) + o(\mathcal{B}(n,m))$ bits, where $\mathcal{B}(n,m)$ is the minimum number of bits required to represent an arbitrary set of size $n$ from a universe of $m$ elements. Thus we show, somewhat counterintuitively, that it is not necessary to store the colors of the points in order to answer approximate color counting queries. In the special case when the points are in rank space (i.e., when $n=m$), our data structure needs only $O(n)$ bits. We also show that $\Omega(n)$ bits are necessary in that case.
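The quantity $\mathcal{B}(n,m)$ is the standard information-theoretic bound $\lceil \lg \binom{m}{n} \rceil$, since there are $\binom{m}{n}$ subsets of size $n$ in a universe of size $m$. The short sketch below evaluates it exactly with integer arithmetic; the function name `info_bound_bits` is an assumption made for illustration.

```python
from math import comb


def info_bound_bits(n, m):
    """Return ceil(log2(C(m, n))): bits needed to name one n-subset of [1, m].

    For an integer x >= 1, (x - 1).bit_length() equals ceil(log2(x)),
    which avoids floating-point error for large binomial coefficients.
    """
    return (comb(m, n) - 1).bit_length()


# Rank space (n == m): only one possible point set, so the bound is 0 bits.
rank_space_bits = info_bound_bits(4, 4)

# A single point in a universe of 8 coordinates needs 3 bits.
one_point_bits = info_bound_bits(1, 8)
```

In rank space the bound evaluates to 0, which is consistent with the space of the counting structure reducing to the $O(n)$ term when $n=m$.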
Then we turn to succinct data structures for color reporting. We describe a data structure that uses $\mathcal{B}(n,m) + nH_d(S) + o(\mathcal{B}(n,m)) + o(n\lg\sigma)$ bits and answers queries in $O(k+1)$ time, where $k$ is the number of colors in the answer and $H_d(S)$, with $d=\log_{\sigma} n$, is the $d$-th order empirical entropy of the color sequence $S$. Finally, we consider succinct color reporting under restricted updates. Our dynamic data structure uses $nH_d(S)+o(n\lg\sigma)$ bits and supports queries in $O(k+1)$ time.
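As a rough illustration of the entropy term in these space bounds, the sketch below computes the empirical entropy of a color sequence under the common textbook definition: $H_0$ is the entropy of the color frequencies, and $H_k$ averages the $H_0$ of the symbols that follow each length-$k$ context. The function names `h0` and `hk`, and the choice of following (rather than preceding) contexts, are assumptions made for illustration and are not taken from the work summarized here.

```python
from collections import Counter, defaultdict
from math import log2


def h0(seq):
    """Zeroth-order empirical entropy of seq, in bits per symbol."""
    n = len(seq)
    if n == 0:
        return 0.0
    return sum((c / n) * log2(n / c) for c in Counter(seq).values())


def hk(seq, k):
    """k-th order empirical entropy of seq, in bits per symbol.

    Groups every symbol by the k symbols that precede it, then takes the
    length-weighted average of the zeroth-order entropy of each group.
    """
    if k == 0:
        return h0(seq)
    n = len(seq)
    followers = defaultdict(list)
    for i in range(k, n):
        followers[tuple(seq[i - k:i])].append(seq[i])
    return sum(len(f) * h0(f) for f in followers.values()) / n
```

A strictly alternating color sequence such as `[1, 2, 1, 2, 1, 2]` has zeroth-order entropy close to 1 bit per symbol but first-order entropy 0, because each color is fully determined by the one before it. This is the kind of regularity that an $nH_d(S)$ bound can exploit and an $n\lg\sigma$ bound cannot.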