CS-2003-01 | ||||
---|---|---|---|---|
Title | Processing Sliding Window Multi-Joins in Continuous Queries | |||
Authors | Lukasz Golab and M. Tamer Ozsu | |||
Abstract | We study sliding window multi-join processing in continuous queries over data streams. Several algorithms are reported for performing continuous, incremental joins, under assumption that all the sliding windows fit in main memory. The algorithms include multi way incremental nested loop joins (NLJs) and multi-way incremental hash joins. We also propose join ordering heuristics to minimize the processing cost per unit time. We test a possible implementation of these algorithms and show that as expected, has joins are faster than NLJs for performing equi-joins, and that the overall processing cost is influenced by the strategies used to remove expired tuples from the sliding windows | |||
Date | February 2003 | |||
Report | Processing Sliding Window Multi-Joins in Continuous Queries (PDF) |
CS-2003-04 | ||||
---|---|---|---|---|
Title | A Roadmap to a Medeling Language for Multi-Agent Systems Engineering | |||
Authors | Viviane Silva, Carlos Lucena, Paulo Alencar and Donald Cowan | |||
Abstract | In this paper we present a conceptual framework called TAO that organizes the core concepts related to multi-agent systems and assumes that agents and objects co-exist as independent and unique abstractions. We then use the TAO framework to create the multi-agent system modelling language MAS-ML, which extends UML (Unified Modelling Language). MAS-ML is defined as a conservative extension of the UML metamodel, and includes agent-related notions that are part of the TAO conceptual framework while preserving all object-related concepts, which constitute the UML metamodel. Creating a modelling language for multi-agent systems is a complex task and we should consider following a process or roadmap similar to the one that led to the development of the UML and its various sub-models and supporting facilities. | |||
Date | February 2003 | |||
Report | A Roadmap to a Medeling Language for Multi-Agent Systems Engineering (PDF) |
CS-2003-06 | ||||
---|---|---|---|---|
Title | Towards Identifying Frequent Items in Sliding Windows | |||
Authors | David DeHaan, Erik D. Demaine, Lukasz Golab, Alejandro Lopez-Ortiz and J. Ian Munro | |||
Abstract | Queries that return a list of frequently occurring items are popular in the analysis of data streams such as real-time Internet traffic logs. While several resultsexist for computing frequent item queries using limited memory in the infinite stream model, none have been extended to the limited-memory sliding window model, which considers only the last N items that have arrived at any given time and forbids the storage of the entire window in memory. We present several algorithms for identifying frequent items in sliding windows, both under arbitrary distributions and assuming that each window conforms to a multinomial distribution. The former is a straightforward extension of existing algorithms and is shown to work well when tested on real-life TCP traffic logs. Our algorithms for the multinomial distribution are shown to outperform classical inference based on random sampling from the sliding window, but lose their accuracy as predictors of item frequencies when the underlying distribution is not multinomial. | |||
Date | March 2003 | |||
Report | Towards Identifying Frequent Items in Sliding Windows (PDF) |
CS-2003-10 | ||||
---|---|---|---|---|
Title | An Efficient Bounds Consistency Algorithm for the Global Cardinality Constraint | |||
Authors | Claude-Guy Quimper, Peter van Beek, Alejandro Lopez-Ortiz, Alexander Golynski and Sayyed Bashir Sadjad | |||
Abstract | Previous studies have demonstrated that designing special purpose constraint propagators can significantly improve the efficiency of a constraint programming approach. In this paper we present an efficient algorithm for bounds consistency propagation of the generalized cardinality constraint "gcc". Using a variety of benchmark and random problems, we show that our bounds consistency algorithm is competitive with and can dramatically outperform existing state-of-the-art commercial implementations of constraint propagators for the gcc. We also present a new algorithm for domain consistency propagation of the gcc which improves on the worst-case performance of the best previous algorithm for problems that occur often in applications. | |||
Date | February 2003 | |||
Report | An Efficient Bounds Consistency Algorithm for the Global Cardinality Constraint (PS) | An Efficient Bounds Consistency Algorithm for the Global Cardinality Constraint (PDF) |
CS-2003-21 | ||||
---|---|---|---|---|
Title | A Geometric B-Spline Over the Triangular Domain | |||
Authors | Christopher K. Ingram | |||
Abstract |
For
modelling
curves,
B-splines
[3]
are
among
the
most
versatile
control
schemes.
However,
scaling
this
technique
to
surface
patches
has
proven
to
be
a
non-trivial
endeavor.
While
a
suitable
scheme
exists
for
rectangular
patches
in
the
form
of
tensor
product
B-splines,
techniques
involving
the
triangular
domain
are
much
less
spectacular. The current cutting edge in triangular B-splines is the DMS-spline [2]. While the resulting surfaces possess high degrees of continuity, the control scheme is awkward and the evaluation is computationally expensive. A more fundamental problem is the construction bears little resemblance to the construction used for the B-Spline. This deciency leads to the central idea of the thesis; what happens if the simple blending functions found at the heart of the B-Spline construction are used over higher dimension domains? In this thesis I develop a geometric generalization of B-Spline curves over the triangular domain. This construction mimics the control point blending that occurs with uniform B-Splines. The construction preserves the simple control scheme and evaluation of B-Splines, without the immense computational requirements of DMSsplines. The result is a new patch control scheme, the G-Patch, possessing C0 continuity between adjacent patches. | |||
Date | June 2003 | |||
Report | A Geometric B-Spline Over the Triangular Domain (PDF) |
CS-2003-22 | ||||
---|---|---|---|---|
Title | Offset Surface Light Fields | |||
Authors | Jason Ang | |||
Abstract | For producing realistic images, reflection is an important visual effect. Reflections of the environment are important not only for highly reflective objects, such as mirrors, but also for more common objects such as brushed metals and glossy plastics. Generating these reflections accurately at real-time rates for interactive applications, however, is a difficult problem. Previous works in this area have made assumptions that sacrifice accuracy in order to preserve interactivity. I will present an algorithm that tries to handle reflection accurately in the general case for real-time rendering. The algorithm uses a database of prerendered environment maps to render both the original object itself and an additional bidirectional reflection distribution function (BRDF). The algorithm performs image-based rendering in reflection space in order to achieve accurate results. It also uses graphics processing unit (GPU) features to accelerate rendering. | |||
Date | August 2003 | |||
Report | Offset Surface Light Fields (PDF) |
CS-2003-24 | ||||
---|---|---|---|---|
Title | Evaluation of DBMSs Using XBench Benchmark | |||
Authors | M. Tamer Ozsu and Benjamin B. Yao | |||
Abstract |
XML
support
is
being
added
to
existing
database
management
systems
(DBMSs)
and
native
XML
systems
are
being
developed
both
in
industry
and
in
academia.
The
individual
performance
characteristics
of
these
approaches
as
well
as
the
relative
performance
of
various
systems
is
an
ongoing
concern.
XBench is a family of XML benchmarks which recognizes that the XML data that DBMSs manage are quite varied and no one database schema and workload can properly capture this variety. Thus, the members of this benchmark family have been defined for capturing diverse application domains. In this report we briefly discuss the XBench XML benchmark and report on the relative performance of three commercial DBMSs: X-Hive, DB2 and SQL Server. We also describe the potential issues when storing XML documents into relational DBMSs. | |||
Date | August 2003 | |||
Report | Evaluation of DBMSs Using XBench Benchmark (PDF) |
CS-2003-25 | ||||
---|---|---|---|---|
Title | Investigations in Tree Locking for Compiled Database Applications | |||
Authors | Heng YU and Grant Weddell | |||
Abstract | We report on initial experiments in tree locking schemes for compiled database applications. Such applications have a repository style of architecture in which a collection of software modules or subsystems operate on a common database in terms of a predefined set of transaction types, and are very often at the core of embedded systems. Since the tree locking protocol is deadlock free, it becomes possible to decouple recovery mechanisms from concurrency control, a property that we believe is critical to the successful deployment of database technology to this new application area. Our experiments show that the performance of tree locking can compete with two phase locking for cases such asmthe above in which a great deal can be known at the time of system generation about workloads. | |||
Date | September 2003 | |||
Report | Investigations in Tree Locking for Compiled Database Applications (PDF) |
CS-2003-26 | ||||
---|---|---|---|---|
Title | A Study on Atmospheric Halo Visualization | |||
Authors | Sung Min Hong and Gladimir Baranoski | |||
Abstract | In this report, the atmospheric halo phenomena have been explained and current trends of the halo simulation have been described. Then A ray tracer based on the Monte Carlo method has been presented and demonstrated. The comparison of the simulation results with the photos taken for the same phenomena has shown that the ray tracer gives reasonable results. This ray tracer provides a very useful tool for simulating the halo phenomena and identifying atmospheric data like proportions of ice crystals and orientations. With some enhancements of the ray tracer, which are future works, the ray tracer is expected to give physically correct realistic halo images. | |||
Date | September 2003 | |||
Report | A Study on Atmospheric Halo Visualization (PDF) |
CS-2003-27 | ||||
---|---|---|---|---|
Title | Supporting Set-at-a-time Extensions for XML through DOM | |||
Authors | Hai (Helena) Chen, Frank Tompa | |||
Abstract |
With
the
rapid
growth
of
the
web
and
e-commerce,
W3C
produced
a
new
standard,
XML,
as
a
universal
data
representation
format
to
facilitate
information
interchange
and
integration
from
heterogeneous
systems.
In
order
to
process
XML,
documents
need
to
be
parsed
and
then
users
can
access,
manipulate,
and
retrieve
the
XML
data
easily.
DOM
is
one
of
the
major
application
programming
interfaces,
which
provides
the
abstract,
logical
tree
structure
of
an
XML
document.
Though
it
is
a
very
promising
initiative
in
defining
the
standard
interface
for
XML
documents
to
access
all
data
required
by
applications,
significant
benefit
can
result
if
some
functions
are
extended.
In
this
thesis,
we
want
to
support
set-at-a-time
extensions
for
XML
through
DOM.
Through
extending
some
powerful
functions
to
get
summary
information
from
different
groups
of
related
document
elements,
and
filter,
extract
and
transform
this
information
as
a
sequence
of
nodes,
the
extended
DOM
can
reduce
the
communications
overhead
and
response
time
between
the
client
and
the
server
and
therefore
provide
applications | |||
Date | September 2003 | |||
Report | Supporting Set-at-a-time Extensions for XML through DOM (PDF) |
CS-2003-28 | ||||
---|---|---|---|---|
Title | On the Relevance of Self-Similarity in Network Traffic Prediction | |||
Authors | Majid Ghaderi | |||
Abstract |
Self-similarity
is
an
important
characteristic
of
traffic
in
high
speed
networks
which
can
not
be
captured
by
traditional
traffic
models.
Traffic
predictors
based
on
non-traditional
long-memory
models
are
computationally
more
complex
than
traditional
predictors
based
on
short-memory
models.
Even
online
estimation
of
their
parameters
for
actual
traffic
traces
is
not
trivial
work.
Based
on
the
observation
that
the
Hurst
parameter
of
real
traffic
traces
rarely
exceeds
0.85,
which
means
that
real
traffic
does
not
exhibit
strong
long-range
dependence,
and
the
fact
that
infinite
history
is
not
possible
in
practice,
we
propose
to
use
a
simple
non-model-based
minimum
mean
square
error
predictor.
In this paper, we look at the problem of traffic prediction in the presence of self-similarity. We briefly describe a number of short-memory and long-memory stochastic traffic models and talk about non-model-based predictors, particularly minimum mean square error and its normalized version. Numerical results of our experimental comparison between the so-called fractional predictors and the simple minimum mean square error predictor show that this simple method can achieve accuracy within $5\%$ of the best fractional predictor while it is much simpler than any model-based predictor and is easily used in an on-line fashion. | |||
Date | October 2003 | |||
Report | On the Relevance of Self-Similarity in Network Traffic Prediction (PS) | On the Relevance of Self-Similarity in Network Traffic Prediction (PDF) |
Compressed
PostScript: On the Relevance of Self-Similarity in Network Traffic Prediction (GZIP) |
CS-2003-29 | ||||
---|---|---|---|---|
Title | On Indexing Sliding Windows over On-line Data Streams | |||
Authors | Lukasz Golab,Shaveen Garg and M. Tamer Ozsu | |||
Abstract | We consider indexing sliding windows in main memory over on-line data streams. Our proposed data structures and query semantics are based on a division of the sliding window into sub-windows. When a new sub-window fills up with newly arrived tuples, the oldest sub-window is evicted, indices are refreshed, and continuous queries are re-evaluated to reflect the new state of the window. By classifying relational operators according to their method of execution in the windowed scenario, we show that many useful operators require access to the entire window, motivating the need for two types of indices: those which provide a list of attribute values and their counts for answering set-valued queries, and those which provide direct access to tuples for answering attribute-valued queries. For the former, we evaluate the performance of linked lists, search trees, and hash tables as indexing structures, showing that the high costs of maintaining such structures over rapidly changing data are offset by the savings in query processing costs. For the latter, we propose novel ways of maintaining windowed ring indices, which we show to be much faster than conventional ring indices and more efficient than executing windowed queries without an index. | |||
Date | September 2003 | |||
Report | On Indexing Sliding Windows over On-line Data Streams (PDF) |
CS-2003-30 | ||||
---|---|---|---|---|
Title | Symbolic Representation and Retrieval of Moving Object Trajectories | |||
Authors | Lei Chen, Vincent Oria, and M. Tamer Ozsu | |||
Abstract | Similarity-based retrieval of moving object trajectory is useful to many applications, such as GPS system, sport and surveillance video analysis. However, due to sensor failures, errors in detection techniques, or different sampling rates, noises, local shifts and scales may appear in the trajectory records. Hence, it is difficult to design a robust and fast similarity measure for similarity-based retrieval in a large database. In this paper, we propose a normalized edit distance (NED) to measure the similarity between two trajectories. Compared to Euclidean distance, Dynamic Time Warping (DTW), and Longest Common Subsequences (LCSS), NED is more robust and accurate for trajectories contain noise and local time shifting. In order to improve the retrieval efficiency, we further convert the trajectories into a symbolic representation, called movement pattern strings, which encode both the movement direction and the movement distance information of the trajectories. The distances that are computed in a symbolic space are lower bounds of the distances of original trajectory data, which guarantees that no false dismissals will be introduced using movement pattern strings to retrieve trajectories. Furthermore, we define a modified frequency distance for frequency vectors that are obtained from movement pattern strings to reduce the dimensionality of movement pattern strings and computation cost of NED. The tests that we conducted indicate that the normalized edit distance is effective and the matching techniques are efficient. | |||
Date | September 2003 | |||
Report | Symbolic Representation and Retrieval of Moving Object Trajectories (PDF) |
CS-2003-33 | ||||
---|---|---|---|---|
Title | Generalized Infinite Undo and Speculative User Interfaces | |||
Authors | Alexjandro Lopez-Ortiz | |||
Abstract | We study the potential benefits of a computations model supporting an unlimited number of undo-like operations. We argue that, when implemented to its full generality this result s in a novel mode of human-computer interaction in which the user session is not linear in time. The model proposed is also more resilient in the presence of user or computer error. This resiliency can be exploited by a speculative user interface (SUI) which guesses user intentions. If the prediction is incorrect, the generalized infinite undo facility provides the ability to extemporaneously roll back the misprediction. We observe that this model is only now becoming a realistic possibility due to the current surplus of available CPU cycles on a modern desktop. | |||
Date | September 2003 | |||
Report | Generalized Infinite Undo and Speculative User Interfaces (PS) | Generalized Infinite Undo and Speculative User Interfaces (PDF) |
CS-2003-34 | ||||
---|---|---|---|---|
Title | Exploiting Statistics of Web Traces to Improve Caching Algorithms | |||
Authors | Alexander Golynski, Alejandro L´opez-Ortiz and Ray Sweidan | |||
Abstract | Web caching plays an important role in reducing network traffic, user delays, and server load. One important factor in describingthe stream of requests made to a given server is the popularity oflinks between accessed pages. For example, the probability of requesting page A increases if a user makes a request to a page which contains a link to page A. Furthermore, this probability depends on the amount of time elapsed since the request was made to the page containing the link and on popularity of the link. In this project we (1) analyze web access logs and determine the frequency distribution of access and mean life expectancy of correlations, (2) propose two new cache replacement policies that use the above distribution, and (3) evaluate the effectiveness of the new policies by comparing them to widely-used algorithms such as Greedy Dual-Size (GDS) and Greedy Dual-Frequency (GDSF). | |||
Date | October 2003 | |||
Report | Exploiting Statistics of Web Traces to Improve Caching Algorithms (PS) | Exploiting Statistics of Web Traces to Improve Caching Algorithms (PDF) |
Compressed
PostScript: Exploiting Statistics of Web Traces to Improve Caching Algorithms (GZIP) |
CS-2003-36 | ||||
---|---|---|---|---|
Title | Using Word Position in Documents for Topic Characterization | |||
Authors | Authors: Reem K. Al-Halimi, Frank W. Tompa | |||
Abstract | In this report we show how to use the position of words in a text for characterizing the topic of the text, and compare our method to measures that use frequency statistics that are independent of word position. We show that word position information produces words that are more suited for characterizing topics and at the same time relies on a vocabulary size that is as little as 10\%$ of the size used by the other measures. | |||
Date | October 2003 | |||
Report | Using Word Position in Documents for Topic Characterization (PDF) |
CS-2003-39 | ||||
---|---|---|---|---|
Title | Enforcing Domain Consistency on the extended Global Cardinality Constraint is NP- Hard | |||
Authors | Claude-Guy Quimper | |||
Abstract | In a global cardinality constraint, there is a set of variables X = {x_1, ... x_n} and a set of values D. Each variable x_i is associated to a domain dom(x_i) contained in D and each value v in D is associated to a cardinality set K(v). An assignment satisfies the extended global cardinality constraint (extended-GCC) if each variable x_i is instantiated to a value in its domain dom(x_i) and if each value v in D is assigned to k variables for some k in K(v). Extended-GCC differs from normal GCC in the fact that its sets of cardinality K(v) can be any set of values. In normal GCC these cardinality sets are restricted to intervals. We prove that enforcing domain consistency on the extended-GCC is NP-hard. | |||
Date | November 2003 | |||
Report | Enforcing Domain Consistency on the extended Global Cardinality Constraint is NP- Hard (PS) | Enforcing Domain Consistency on the extended Global Cardinality Constraint is NP- Hard (PDF) |
CS-2003-41 | ||||
---|---|---|---|---|
Title | Call Admission Control for Voice/Data Integration in Broadband Wireless Networks | |||
Authors | Majid Ghaderi and Raouf Boutaba | |||
Abstract | This paper addresses bandwidth allocation for an integrated voice/data broadband mobile wireless network. Specifically, we propose a new admission control scheme called EFGC, which is an extension of the well-known fractional guard channel scheme proposed for cellular networks supporting voice traffic. The main idea is to use two acceptance ratios, one for voice calls and the other for data calls in order to maintain the proportional service quality for voice and data traffic while guaranteeing a target handoff failure probability for voice calls. We describe two variations of the proposed scheme: EFGC-REST, a conservative approach which aims at preserving the proportional service quality by sacrificing the bandwidth utilization; and EFGC-UTIL, a greedy approach which achieves higher bandwidth utilization at the expense of increasing the handoff failure probability for voice calls. Extensive simulation results show that our schemes satisfy the hard constraints on handoff failure probability and service differentiation while maintaining a high bandwidth utilization. | |||
Date | November 2003 | |||
Report | Call Admission Control for Voice/Data Integration in Broadband Wireless Networks (PS) | Call Admission Control for Voice/Data Integration in Broadband Wireless Networks (PDF) |
Compressed
PostScript: Call Admission Control for Voice/Data Integration in Broadband Wireless Networks (GZIP) |
CS-2003-45 | ||||
---|---|---|---|---|
Title | Revisiting the Foundations of Subsurface Scattering | |||
Authors | Gladimir V. G. Baranoski, Aravind Krishnaswamy and Bradley Kimmel | |||
Abstract | Despite the significant advances in rendering, we are far from being able to automatically generate realistic and predictable images of organic materials such as plant and human tissues. Creating convincing pictures of these materials is usually accomplished by carefully adjusting rendering parameters. A key issue in this context is the simulation of subsurface scattering. Current algorithmic models usually rely on scattering approximations based on the use of phase functions, notably the Henyey-Greenstein phase function and its variations, which were not derived from biophysical principles of any organic material, and whose parameters have no biological meaning. In this report, we challenge the validity of this approach for organic materials. Initially, we present an original chronology of the use of these phase functions in tissue optics simulations, which highlights the pitfalls of this approach. We then demonstrate that a significant step toward predictive subsurface scattering simulations can be given by replacing it by a more efficient and accurate data oriented approach. Our investigation is supported by comparisons involving the original measured data that motivated the application of phase functions in tissue subsurface scattering simulations. We hope that the results of our investigation will help strengthen the biophysical basis required for the predictive rendering of organic materials. | |||
Date | December 2003 | |||
Report | Revisiting the Foundations of Subsurface Scattering (PDF) |