Copyright © 2010 Lijun Chen et al. This is an open access article distributed under the
Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
A skyline query computes all the "best" elements, that is, those
not dominated by any other element, and is thus very important
for decision-making applications. Recently, it has been generalized
to the skyband query: a k-skyband query returns
those elements dominated by no more than k other elements.
To incorporate the skyband operator into a stream engine
for monitoring skybands over sliding windows, estimating the space
usage of the skyband operator becomes a critical issue for
the query optimizer. In this paper, we first introduce the
skyband sketch as the cost model. Based on the cost model,
we propose an approach for estimating the space usage of
the skyband operator over sliding windows of data streams under
the assumptions of statistical independence across dimensions,
no duplicate values in any dimension, and totally ordered
dimension domains. Experiments verify that
our approaches can estimate the space usage effectively over
arbitrarily distributed data. To the best of our knowledge,
this is the first work that attempts to address the issue and
proposes effective approaches to solve it.
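
As an illustration of the k-skyband definition given above, the following is a minimal brute-force sketch in Python. It is not the estimation approach proposed in this paper. It assumes that smaller values are preferred in every dimension, so an element p dominates q when p is no worse than q in all dimensions and strictly better in at least one. The names dominates and k_skyband and the sample points are hypothetical and chosen only for this example.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q (assumption: smaller is better)."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def k_skyband(points: List[Point], k: int) -> List[Point]:
    """Return the elements dominated by no more than k other elements.

    Brute force, O(n^2) dominance tests; for illustration only.
    """
    result = []
    for q in points:
        # Count how many elements of the input dominate q.
        dominators = sum(1 for p in points if dominates(p, q))
        if dominators <= k:
            result.append(q)
    return result


# Hypothetical two-dimensional sample data.
points = [(1, 5), (2, 2), (3, 3), (4, 4), (5, 1)]
skyline = k_skyband(points, 0)      # the skyline is the 0-skyband
one_skyband = k_skyband(points, 1)  # also admits elements with one dominator
```

In this sample, (3, 3) is dominated only by (2, 2), so it is excluded from the skyline but included in the 1-skyband, while (4, 4) is dominated by both (2, 2) and (3, 3) and appears only for k of 2 or more.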