2007 Technical Reports | SCS | UW
CS-2007-01 |
Title |
Reputation-Oriented Reinforcement Learning
Strategies for Economically-Motivated Agents in
Electronic Market Environments |
Authors |
Thomas Thanh Tran |
Abstract |
In this thesis, we propose a market model and learning algorithms for buying
and selling agents in electronic marketplaces. We take into account the fact
that multiple selling agents may offer the same good with different qualities,
and that selling agents may alter the quality of their goods. We also consider
the possible existence of dishonest selling agents in the market. In our
approach, buying agents learn to maximize their expected value of goods using
reinforcement learning. In addition, they model and exploit the reputation of
selling agents to avoid interaction with the disreputable ones, and therefore
to reduce the risk of purchasing low value goods. Our selling agents learn to
maximize their expected profits by using reinforcement learning to adjust
product prices, and also by altering product quality to provide more
customized value to their goods. We experimentally evaluate our model on both
microscopic and macroscopic levels. On the micro level, we examine the
individual benefit of agents, in particular their level of satisfaction. Our
experimental results confirm that in both modest and large-sized marketplaces,
buying and selling agents following our proposed algorithms achieve better
satisfaction than buying and selling agents who only use reinforcement
learning. On the macro level, we study how a marketplace populated with our
buying and selling agents would behave as a whole. Our results show that such
a marketplace can reach an equilibrium state where the agent population
remains stable and that this equilibrium is beneficial for the participant
agents. The market model and learning algorithms presented in this thesis can
therefore be used in designing desirable market environments and effective
economically-motivated agents for e-commerce applications.
|
Date |
January 19, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-02 |
Title |
Optimal Top-Down Join Enumeration (extended version) |
Authors |
David DeHaan, dedehaan@uwaterloo.ca
Frank Wm. Tompa, fwtompa@uwaterloo.ca |
Abstract |
Most contemporary database systems perform cost-based join
enumeration
using some variant of System-R's bottom-up dynamic programming method.
The notable exceptions are systems based on the top-down
transformational search of Volcano/Cascades.
As recent work has demonstrated, bottom-up dynamic programming
can attain optimality with respect to the shape of the join graph;
no comparable results have been published for transformational search.
However, transformational systems leverage benefits
of top-down search not available to bottom-up methods.
In this paper we describe a top-down join enumeration algorithm
that is optimal with respect to the join graph.
We present performance results demonstrating that
a combination of optimal enumeration with search strategies
such as branch-and-bound yields an algorithm significantly
faster than those previously described in the literature.
Although our algorithm enumerates the search space top-down,
it does not rely on transformations and
thus retains much of the architecture of traditional dynamic programming.
As such, this work provides a migration path for existing
bottom-up optimizers to exploit top-down search
without drastically changing to the transformational paradigm.
|
Date |
March 14, 2007
|
Comments |
This is an extended version of work published in
SIGMOD'07, June 11-14, 2007, Beijing, China.
That work is copyright ACM, 2007. |
Report |
|
|
Adobe PDF |
|
CS-2007-03 |
Title |
Predictable Semiautomata |
Authors |
Janusz Brzozowski and Nicolae Santean |
Abstract |
We introduce a new class of nondeterministic semiautomata:
A nondeterministic semiautomaton S is predictable if there
exists a positive integer k such that, if S knows the
present input a and the next k inputs, then the transition
under a is deterministic. Nondeterminism may occur only
when the length of the unread input is less than k+1. We
develop a comprehensive theory of predictable semiautomata.
Using a novel semiautomaton, called the core, we present
a test for predictability. We then introduce the predictor
semiautomaton, based on a look-ahead semiautomaton, that
is essentially deterministic. We describe two ways of
using the predictor to simulate a nondeterministic
semiautomaton. The first simulation predicts the set of
states reachable by every prefix of the input word as long
as the prefix is in the language of the semiautomaton.
The second simulation is similar, but it stops as soon as
it infers that the input word is not in the language of
the semiautomaton. Moreover, the membership of a word in
the language of a semiautomaton can be decided completely
deterministically. Finally, we show that, if a semiautomaton
with n states over a one-letter alphabet is k-predictable,
k being the smallest such integer, then k is at most n-1,
and this bound can be reached. For semiautomata over
arbitrary alphabets, k is at most (n^2-n)/2, and
this bound can be reached for a suitable input alphabet.
Keywords: Automaton, delegator, look-ahead, nondeterminism,
predictor, selector, semiautomaton, simulation
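
To make the definition concrete, here is a brute-force test of
k-predictability (a minimal sketch in Python, exponential in k; the
dictionary encoding of the transition relation is our own assumption, and
this illustrates the definition rather than the paper's core-based test):

from itertools import product

def can_read(delta, state, word):
    # True if some transition path labeled `word` is defined from `state`.
    current = {state}
    for a in word:
        current = {p for q in current for p in delta.get((q, a), set())}
        if not current:
            return False
    return True

def is_k_predictable(states, alphabet, delta, k):
    # For every nondeterministic transition delta[(q, a)], each lookahead
    # word of length k must be readable from at most one candidate successor.
    for q in states:
        for a in alphabet:
            successors = delta.get((q, a), set())
            if len(successors) < 2:
                continue  # transition already deterministic
            for w in product(alphabet, repeat=k):
                if sum(can_read(delta, p, w) for p in successors) > 1:
                    return False
    return True

# The lookahead 'b' vs. 'c' resolves the nondeterministic move from state 0:
delta = {(0, 'a'): {1, 2}, (1, 'b'): {1}, (2, 'c'): {2}}
print(is_k_predictable({0, 1, 2}, 'abc', delta, 1))  # True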
|
Date |
May 9, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-04 |
Title |
Software System Generation from an Enterprise Service Model |
Authors |
D.D. Cowan, P.S.C. Alencar, D.B. Brown, H.D. Covvey, I. Gimenes,
C.J.P. Lucena, Bill Malyk, D.W. Mulholland, A. Robins, K. Young
University of Waterloo |
Abstract |
This paper introduces an approach to constructing information
systems based on
enterprise service architectures (ESAs). An ESA model is specified using
declarative
business process modeling or workflow commands in conjunction with service
framework instances where the business process model and the instances
are captured in
a data structure. A service framework is an outline of different types,
such as reports, databases, agents, diagrams and maps, and an instance is
produced by completing a framework's adaptation points. Adaptation points
are defined by an adaptation
interface
specified in a declarative language and presented through forms. The
business process or
workflow is specified similarly. An automatic transformation is then
applied to convert
the model into an operational information system.
This approach can be viewed as model-driven software development without
the
necessity of writing code to complete the implementation. Thus, we avoid
the symptoms
of architectural drift that occur when applications evolve independently
from their model.
The control structure is declarative and separate, clearly delineating
the concerns, thus
making it easier to address future anticipated, unanticipated and crosscutting
concerns.
We have created tools to support this approach, and although many of
the tools are
general, we have concentrated our efforts on the Web because of its pervasive
nature.
Thus, we have produced the Web-based Informatics Development Environment
(WIDE)
technology, a set of tools and processes used to transform a model based
on ESA
principles into a web-based information system. We have designed and
implemented over
30 operational web-based systems to validate our approach.
|
Date |
May 1, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-05 |
Title |
OOMatch: Pattern Matching as Dispatch in Java |
Authors |
Adam Richard and Ondrej Lhotak |
Abstract |
We present a new language feature, specified as an extension
to Java. The feature is a form of dispatch which includes and
subsumes multimethods, but which
is not as powerful as general predicate dispatch. It is,
however, intended to be more practical and easier to use than
the latter. The extension, dubbed OOMatch, allows method
parameters to be specified as patterns, which are matched
against the arguments to the method call. When matches
occur, the method applies; if multiple methods apply, the
method with the more specific pattern overrides the others.
The pattern matching is very similar to that found in the "case" constructs of many functional languages (ML,
for example), with an important difference: functional languages
normally allow pattern matching over variant types
(and other primitives such as tuples), while OOMatch allows
pattern matching on Java objects. Indeed, the wider
goal here is the study of the combination of functional and
object-oriented programming paradigms. Of special importance,
we ensure that matching can occur while retaining the
complete control of class designers to prevent implementation
details (such as private variables) from being exposed to
clients of the class.
We present an informal "tutorial" description of OOMatch, a formal
specification of the language, and a proof that the specified conditions
guarantee run-time safety.
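
For readers who know mainstream languages, Python's structural pattern
matching (3.10+) gives a rough analogy to dispatch on patterns, though its
cases are tried in source order rather than by OOMatch's specificity-based
overriding; the Rect class here is hypothetical and this is not OOMatch
syntax:

from dataclasses import dataclass

@dataclass
class Rect:
    w: float
    h: float

def area(shape):
    # Cases are tried top-down; the more specific pattern (a square) takes
    # precedence, loosely mirroring how a more specific OOMatch method
    # overrides a more general one.
    match shape:
        case Rect(w=w, h=h) if w == h:
            return f"square: {w * w}"
        case Rect(w=w, h=h):
            return f"rect: {w * h}"

print(area(Rect(2, 2)))  # square: 4
print(area(Rect(2, 3)))  # rect: 6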
|
Date |
March 19, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-06 |
Title |
Efficient Search for the Top-k Probable Nearest Neighbors
in Uncertain Databases |
Authors |
George Beskales, Mohamed A. Soliman and Ihab F. Ilyas |
Abstract |
Uncertainty pervades many domains in our lives. Current
real-life applications, e.g., location tracking using GPS devices or cell
phones, multimedia feature extraction, and sensor data management, deal
with different kinds of uncertainty. Finding the nearest neighbor objects
to a given query point is an important query type in these applications.
In this paper, we study the problem of finding objects with the highest
marginal probability of being the nearest neighbors to a query object. We
adopt a general uncertainty model allowing for data and query uncertainty.
Under this model, we define new query semantics, and provide several efficient
evaluation algorithms. We analyze the cost factors involved in query evaluation,
and present novel techniques to address the trade-offs among these factors.
We give multiple extensions to our techniques including handling dependencies
among data objects, and answering threshold queries. We conduct an extensive
experimental study to evaluate our techniques on both real and synthetic
data.
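
To pin down the query semantics (not the report's evaluation algorithms),
the following naive Monte Carlo baseline estimates each object's marginal
probability of being the nearest neighbor under an assumed uncertainty
model, with every distribution and number purely illustrative:

import random, math

def topk_probable_nn(objects, query, k=2, trials=10_000, sigma=1.0):
    # objects: {id: (x, y)} reported locations; each true location is
    # modeled (an assumption for illustration) as an isotropic Gaussian
    # around the report. Returns the k objects with the highest estimated
    # marginal probability of being the query's nearest neighbor.
    wins = {oid: 0 for oid in objects}
    for _ in range(trials):
        best, best_d = None, math.inf
        for oid, (x, y) in objects.items():
            sx, sy = random.gauss(x, sigma), random.gauss(y, sigma)
            d = (sx - query[0]) ** 2 + (sy - query[1]) ** 2
            if d < best_d:
                best, best_d = oid, d
        wins[best] += 1
    probs = {oid: w / trials for oid, w in wins.items()}
    return sorted(probs.items(), key=lambda kv: -kv[1])[:k]

print(topk_probable_nn({'a': (0, 0), 'b': (1, 0), 'c': (5, 5)}, (0.4, 0)))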
|
Date |
October 20, 2008
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-07 |
Title |
Bounded Model Checking with Description Logic
Reasoning |
Authors |
Shoham Ben-David, Richard Trefler, Grant Weddell |
Abstract |
Model checking is a technique for verifying that a finite-state concurrent
system is correct with respect to its specification. In bounded model
checking (BMC), the system is unfolded until a given depth, and translated
into a CNF formula. A SAT solver is then applied to the CNF formula, to find
a satisfying assignment. Such a satisfying assignment, if found, demonstrates
an error in the model of the concurrent system.
Description Logic (DL) is a family of knowledge representation formalisms,
for which reasoning is based on tableaux techniques. We show how Description
Logic can serve as a natural setting for representing and solving a BMC
problem. We formulate a bounded model checking problem as a consistency
problem in the DL dialect $\mathcal{ALCI}$. Our formulation results in a
compact representation of the model, one that is linear in the size of the
model description, and does not involve any unfolding of the model.
Experimental results, using the DL reasoner FaCT++, significantly improve on
a previous approach that used DL reasoning for model checking.
|
Date |
March 26, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-08 |
Title |
Designing Succinct Structural Alphabets |
Authors |
Shuai Cheng Li, Jinbo Xu, Xin Gao, Dongbo Bu,
Ming Li |
Abstract |
The 3D structure of a protein sequence A can be assembled from the
substructures corresponding to small segments of A. A sequence segment does
not take on all possible structural fragments, and thus it is desirable to
build a short customized structural candidate list for each sequence
segment. For each sequence segment, these substructures are its "specific
structural alphabet". The smaller these candidate lists are, the faster the
protein structure can be constructed; the more accurate these candidate
lists are, the more accurate the final protein structure will be. A major
obstacle in protein structure prediction is to construct a small set of
structural candidates for each segment such that the native structure can be
rebuilt from these candidates accurately.
We have developed a software package, FragShaver, based on integer linear
programming that incorporates extra structural information, and have made
significant progress in overcoming the above obstacle:
1. Compared to Rosetta's fragment selection method, at threshold 1Å and
structural fragment length 9, the position coverage is improved from 56.4%
to 79.1% for $\beta$-sheet, and from 55.5% to 67.9% for loop, while
simultaneously reducing the candidate list size from 25 to 10. With
candidate list size 25, our method improves Rosetta's position coverage of
$\beta$-sheet from 56.5% to 89.6% and of loop from 55.5% to 78.1%. At 1.5Å,
we achieve position coverage 96.7%.
2. Applying our method to Kolodny's independent library, our experiments
indicate that our method can identify a small subset of structural
candidates for a given sequence segment that achieves the same accuracy as
the whole library, reducing Kolodny's library size from 200 to 25 at the
same accuracy.
|
Date |
April 4, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-09 |
Title |
The FireCollaborator: a Collaborative Approach
for Attack Detection |
Authors |
Jerome Francois, Adel El-Atawy, Ehab Al-Shaer, and Raouf
Boutaba |
Abstract |
Distributed denial of service attacks are a major threat on the Internet,
and detecting this kind of attack as far as possible from the victim, in
order to save resources, is a real challenge. We propose a new framework to
deal with this problem, based on Intrusion Prevention Systems (IPS) deployed
at the Internet Service Provider (ISP) level. The key point is to use
compressed metrics based on the routing rules in order to extract suspect
traffic through collaboration between routers on the same path, and finally
to confirm an attack by using IPSs on different paths but at the same level
(number of hops before the potential victim). The main metric we use is the
frequency of a rule, but we use entropy as well. We are able to detect a
change in the traffic and to determine which rules change significantly and
exhibit an abnormally high frequency. The beliefs of the different IPSs are
then shared to identify the most likely attacks, before confirming them by
computing packet rates. Our proposal has three main advantages. First,
because the rules are selected, they can be analyzed precisely to make sure
that there is an attack. Second, a lot of resources (network, CPU and
storage) are saved thanks to the choice of metrics and the selection of
rules. Finally, because the attack is detected as far as possible from the
victim, the end host saves resources too, since the attack traffic is
blocked early and congestion can be avoided.
|
Date |
April 16, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-10 |
Title |
Addressing an Open Problem on Regex |
Authors |
Cezar Campeanu and Nicolae Santean |
Abstract |
In this paper we revisit the semantics of extended regular
expressions
(regex), defined succinctly in the 90's (A.V. Aho) and rigorously in
2003
by Campeanu, Salomaa and Yu, when the authors reported an open
problem, namely whether regex languages are closed under the intersection
with regular languages. We give a positive answer to this question; to do
so, we propose a new class of machines - regex automata systems (RAS) -
which are equivalent to regex. Among other benefits, these machines
provide a consistent and convenient method of implementing regex in
practice. We also prove, as a consequence of this closure property, that
several anthological languages, such as the mirror language, the language
of palindromes and the language of balanced words are not regex languages.
Keywords: Extended Regular Expression, Regex Automata System, Regex
|
Date |
April 11, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-11 |
Title |
Succinct Representation of Labeled Graphs |
Authors |
Jeremy Barbay, Luca Castelli Aleardi, Meng He and J. Ian
Munro |
Abstract |
In many applications, properties of an object being modeled
are stored as labels on vertices or edges of a graph. In this paper, we
consider succinct representation of labeled graphs. Our main results are
the succinct representations of labeled and multi-labeled graphs (we consider
vertex labeled planar triangulations, as well as edge labeled planar graphs
and the more general $k$-book embedded graphs) to support various label
queries efficiently. The additional space cost to store the labels is essentially
the information-theoretic minimum. As far as we know, our representations
are the first succinct representations of labeled graphs.
We also present two preliminary results used to achieve the main results. First,
we design a succinct representation of unlabeled planar triangulations
to support the rank/select of edges in ccw (counterclockwise) order
in addition to the other operations supported in previous work. Second,
we design a succinct representation for a $k$-book graph when $k$ is
large to support various navigational operations more efficiently. In
particular, we can test the adjacency of two vertices in $O(\lg k\lg\lg
k)$ time, while previous work uses $O(k)$ time.
|
Date |
May 2, 2007
|
Comments |
|
Report |
|
PostScript |
Adobe PDF |
|
CS-2007-12 |
Title |
Implementation of Parallel Set
Intersection for Keyword Search using RapidMind |
Authors |
Fady Samuel, Jeremy Barbay and Michael McCool |
Abstract |
The intersection of large ordered sets is a common problem
in the
context of the evaluation of boolean queries to a search engine. In
this paper we propose several parallel algorithms for computing the
intersection of sorted arrays, taking advantage of the parallelism
provided by the new generation of multicore processors and by the
RapidMind programming library. We perform an experimental
comparison of performance on an excerpt of a web index and queries
provided by Google, and show an improvement in performance
proportional to the parallelism when using up to four processors.
Our results confirm the intuition that the intersection problem is
implicitly parallelisable at the set level, and not only at the
query level as previously considered.
|
Date |
May 8, 2007
|
Comments |
|
Report |
|
PostScript |
Adobe PDF |
|
CS-2007-13 |
Title |
Faster Set Intersection Algorithms
for Text Searching |
Authors |
Jeremy Barbay, Alejandro Lopez-Ortiz, Tyler Lu and Alejandro
Salinger. |
Abstract |
The intersection of large ordered sets is a common problem
in the
context of the evaluation of boolean queries to a search engine. In
this paper we propose several improved
algorithms for computing the intersection of sorted arrays, and in
particular for searching sorted arrays in the intersection context.
We perform an experimental comparison with the algorithms from
the previous studies by Demaine, López-Ortiz and Munro [ALENEX 2001],
and from Baeza-Yates and Salinger [SPIRE 2005]; in addition, we implement and
test the intersection algorithm from Barbay and Kenyon [SODA 2002] and its randomized
variant [SAGA 2003].
We consider the random data-set from Baeza-Yates and Salinger,
the Google queries used by Demaine et al., a corpus provided by
Google and a larger corpus from the TREC Terabyte 2006 efficiency
query stream, along with its own query log. We measure the
performance both in terms of the number of comparisons and searches
performed, and in terms of the CPU time on two different
architectures. Our results confirm or improve the results from both
previous studies in their respective context (comparison model on
real data and CPU measures on random data), and extend them to new
contexts. In particular we show that value-based search algorithms
perform well on posting lists in terms of the number of
comparisons performed.
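
As a concrete illustration of a value-based search algorithm of the kind
compared here, the following sketch intersects two sorted arrays with
galloping (doubling) search; it is a generic textbook variant, not a
reimplementation of any algorithm from the studies cited above:

from bisect import bisect_left

def gallop(arr, target, lo):
    # Find the first index >= lo with arr[index] >= target, by doubling the
    # probe distance and finishing with a binary search over the last range.
    step = 1
    while lo + step < len(arr) and arr[lo + step] < target:
        step *= 2
    return bisect_left(arr, target, lo, min(lo + step + 1, len(arr)))

def intersect(a, b):
    # Intersect two sorted arrays; galloping makes the cost adaptive,
    # roughly proportional to how interleaved the sets are rather than to
    # their total size.
    if len(a) > len(b):
        a, b = b, a
    out, j = [], 0
    for x in a:
        j = gallop(b, x, j)
        if j == len(b):
            break
        if b[j] == x:
            out.append(x)
            j += 1
    return out

print(intersect([1, 3, 7, 9, 12], [3, 4, 7, 10, 12, 15]))  # [3, 7, 12]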
|
Date |
April 30, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-14 |
Title |
An Approximation Algorithm for Shortest Descending Paths |
Authors |
Mustaq Ahmed and Anna Lubiw |
Abstract |
A path from s to t on a polyhedral terrain is descending if the height of a
point p never increases while we move p along the path from s to t. No
efficient algorithm is known to find a shortest descending path (SDP) from s
to t in a polyhedral terrain. We give a simple approximation algorithm that
solves the SDP problem on general terrains. Our algorithm discretizes the
terrain with $O(n^2 X/\epsilon)$ Steiner points so that after an
$O((n^2 X/\epsilon) \log(n X/\epsilon))$-time preprocessing phase for a
given vertex s, we can determine a $(1+\epsilon)$-approximate SDP from s to
any point v in $O(n)$ time if v is either a vertex of the terrain or a
Steiner point, and in $O(n X/\epsilon)$ time otherwise. Here n is the size
of the terrain, and X is a parameter of the geometry of the terrain.
|
Date |
May 9, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-15 |
Title |
Reasoning about Interaction in Mixed-Initiative Artificial
Intelligence Systems |
Authors |
Michael Fleming
|
Abstract |
This thesis presents computational models for the design
of mixed-initiative artificial intelligence systems that can make rational
decisions about interaction with potentially helpful users. Mixed-initiative
systems are ones in which either the system or the user can take the initiative
to direct the dialogue or the problem solving. These systems have been designed
for such diverse applications as robotics, military planning, intelligent
tutoring and trip scheduling. One challenge in designing these systems is
to specify when the system should take the initiative to interact with the
user. The main contribution of the thesis is to provide designers of mixed-initiative
systems with a systematic approach for constructing systems that can reason
in a principled way about interaction with the user, regardless of the area
of application. Our approach is to model the user, the task and the dialogue
simultaneously. Specific factors are proposed that must be modeled, and
methods are developed for how to combine these factors in order to make
rational decisions about interaction, based on whether the perceived benefits
of communication exceed the expected costs. Some examples and experiments
are described, to demonstrate the value of the models and to justify decisions
that were made in determining the role of each factor in the computational
models. In particular, we emphasize the value of making decisions about
interaction based on a careful evaluation of the needs, preferences and
abilities of the user, leading to mixed-initiative systems that are user-specific
and therefore result in greater overall user satisfaction.
|
Date |
December 18, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-16 |
Title |
On Ordering Descriptions in a Description Logic |
Authors |
Jeffrey Pound, Lubomir Stanchev, David Toman, and Grant Weddell |
Abstract |
We introduce a description language for specifying
partial ordering relations over concept descriptions in description
logics, and show how the language can be used in combination with
binary trees to efficiently search a database that corresponds to a
finite set of concept descriptions. The language consists of a pair
of ordering constructors that support a form of exogenous indexing,
in which search criteria are independent of the data, and a form of
endogenous indexing, in which the data itself provides the search
criteria. Our language can be refined in the same way as a
description logic, in that greater expressiveness and consequent
richer search capability is achieved by adding additional ordering
constructors.
|
Date |
June 7, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-17 |
Title |
Geometric Streaming Algorithms with a Sorting Primitive |
Author |
Eric Y. Chen |
Abstract |
We solve several fundamental geometric problems under a new
streaming model recently
proposed by Ruhl et al. [2, 12]. In this model, in one pass the input
stream can be scanned to generate
an output stream or be sorted based on a user-defined comparator; all
intermediate streams must be
of size O(n). We obtain the following geometric results for any fixed
constant
|
Date |
May 23, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-18 |
Title |
Comprehending Object-Oriented Software Frameworks API Through
Dynamic Analysis |
Authors |
Abbas Heydarnoori, Thiago Tonelli Bartolomei and Krzysztof
Czarnecki |
Abstract |
A common practice followed by many application developers
is to use existing framework applications as a guide to understand how to
implement a framework-provided concept of interest. Unfortunately, finding
the code that implements the concept of interest might be very difficult
since the code might be scattered across and tangled with code implementing
other concepts. To address this issue, this report presents an approach
called FUDA (Framework API Understanding through Dynamic Analysis). The
main idea of this approach is to extract implementation recipes of a given
concept from runtime information collected when that concept is invoked
in a number of different sample applications. For this purpose, we introduce
a novel dynamic slicing approach named 'concept trace slicing' and combine
it with clustering and data mining techniques. The experimental evaluation
of FUDA suggests that this approach is effective in producing useful implementation
recipes for a given concept with few false positives and false negatives.
|
Date |
March 10, 2008
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-19 |
Title |
CIME: some protocols for Computer In the Middle Education |
Authors |
Jeremy Barbay |
Abstract |
We introduce Computer In the Middle Education (CIME), a family of
software providing access to educational data in exchange for further
additions to this data. We present several prototypes: VocaCIME,
focused on gathering and teaching foreign vocabulary associated with
images, and CIMEQuiz, focused on multiple-choice questions on simple
mathematical concepts. In particular, we describe techniques for the
quality control of the data submitted by the user, and techniques to
ensure the growth of the database as it is used. While this work is
partially inspired by the human computation games ESP, Peekaboom,
Phetch and Verbosity from Luis von Ahn and his collaborators, the
approach illustrates new directions of research, such as bootstrapping
databases, where a small initial database is grown into a larger one
by its mere use, and computer-in-the-middle techniques, where the
computer provides a communication interface between users (rather than
a game), but still recovers useful data from it. This technical report
describes prototypes under development or yet to be developed.
|
Date |
June 12, 2007
|
Comments |
|
Report |
|
PostScript |
Adobe PDF |
|
CS-2007-20 |
Title |
Adaptive Planar Convex Hull Algorithm for a Set of Convex
Hulls |
Authors |
Jeremy Barbay and Eric Y. Chen |
Abstract |
An adaptive algorithm is one whose performance can be
expressed more precisely than as a mere function of the size of the
input: output-sensitive algorithms are a special case of adaptive
algorithms. We consider the computation of the convex hull of a set
of convex hulls, for instance in the case where the set of points
has been composed from simpler objects from a library, for each of
which the convex hull has been precomputed. We show that in this
context an adaptive algorithm performs better if it takes advantage
of features other than the size of the output.
|
Date |
November 28, 2007
|
Comments |
|
Report |
|
PostScript |
Adobe PDF |
|
CS-2007-21 |
Title |
PSALM: Accurate Sampling for Cardinality Estimation in a
Multi-user Environment |
Authors |
Huaxin Zhang h7zhang@cs.uwaterloo.ca
Ihab F. Ilyas ilyas@cs.uwaterloo.ca
Kenneth Salem kmsalem@cs.uwaterloo.ca |
Abstract |
In database systems that support fine-grained access controls,
each
user has access rights that determine which tuples are accessible and
which are inaccessible. Queries are answered as if the inaccessible
tuples are not present in the database. Thus, users with different
access rights may get different answers to a given query. To process
queries efficiently in the presence of fine-grained access controls,
the database system needs accurate estimates of the number of tuples
that are both accessible according to the access rights of the
submitting user and relevant according to the selection predicates in
the query. In this paper we present sampling-based cardinality
estimation techniques for use in the presence of fine-grained access
controls. These techniques exploit the fact that access rights are
relatively static and are common to all queries that are evaluated on
behalf of a particular user. We show that the proposed techniques
provide more accurate estimates than simpler techniques that do not
exploit knowledge of access rights. We quantify these improvements
analytically and through simulations.
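
A minimal sketch of the underlying idea (our own simplification, not PSALM
itself; all names and the uniform-sampling scheme are assumptions): because
access rights are static, a sample of a user's accessible tuples can be
drawn once and reused across that user's queries to estimate how many
accessible tuples satisfy a given predicate.

import random

def build_user_sample(table, accessible, sample_size, seed=42):
    # Done once per user: access rights are static, so the sample of
    # accessible tuples can be reused for all of that user's queries.
    rows = [t for t in table if accessible(t)]
    random.seed(seed)
    return random.sample(rows, min(sample_size, len(rows))), len(rows)

def estimate_cardinality(sample, n_accessible, predicate):
    # Scale the fraction of sampled accessible tuples satisfying the query
    # predicate up to the full accessible population.
    if not sample:
        return 0.0
    hits = sum(1 for t in sample if predicate(t))
    return hits / len(sample) * n_accessible

table = [{'dept': random.choice('ABC'), 'salary': random.randint(1, 100)}
         for _ in range(10_000)]
sample, n = build_user_sample(table, lambda t: t['dept'] != 'C', 200)
print(estimate_cardinality(sample, n, lambda t: t['salary'] > 90))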
|
Date |
June 27, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-22 |
Title |
XML Index Recommendation with Tight Optimizer Coupling |
Authors |
Iman Elghandour,
Ashraf Aboulnaga,
Daniel C. Zilio,
Fei Chiang,
Andrey Balmin,
Kevin Beyer,
Calisto Zuzarte |
Abstract |
XML database systems are expected to handle increasingly
complex queries over increasingly large and highly structured XML
databases. An important problem that needs to be solved for these systems
is how to choose the best set of indexes for a given workload. In this
paper, we present an XML Index Advisor that solves this XML index recommendation
problem, and has the key characteristic of being tightly coupled
with the query optimizer. We rely on the optimizer
to enumerate candidate indexes and to estimate the benefit gained from
potential index configurations. We expand the set of candidate indexes
obtained from the query optimizer to include more general indexes that
can be useful for queries other than those in the training workload. To
recommend an index configuration, we introduce two new search algorithms.
The first algorithm finds the best set of indexes for the specific training
workload, and the second algorithm finds a general set of indexes that
can benefit the training workload as well as other similar workloads. We
have implemented our XML Index Advisor in a prototype version of IBM DB2
9, which supports both relational and XML data, and we experimentally demonstrate
the effectiveness of our advisor using this implementation.
|
Date |
July 11, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-23 |
Title |
Small poly-line drawings of series-parallel graphs |
Authors |
Therese Biedl |
Abstract |
In this paper, we study small planar drawings of planar graphs. For
arbitrary planar graphs, $\Theta(n^2)$ is the established upper
and lower bound on the area. It is a long-standing
open problem for which graphs smaller area can be achieved.
We show here that series-parallel graphs can be drawn in $O(n^{3/2})$ area
and outer-planar graphs can be drawn in $O(n\log n)$ area, but
partial 3-trees and 2-outer-planar graphs require $\Omega(n^2)$ area.
Our drawings are visibility representations, which can be
converted to poly-line drawings of asymptotically the same area.
|
Date |
June 21, 2007
|
Comments |
|
Report |
|
PostScript |
Adobe PDF |
|
CS-2007-24 |
Title |
Improving Convergence Rates in Multiagent Learning Through
Experts and Adaptive Consultation |
Authors |
Greg Hines and Kate Larson |
Abstract |
We present a multiagent learning algorithm with guaranteed
convergence
to Nash equilibria for all games. Our approach is a regret-based
learning algorithm which combines a greedy random sampling method with
consultation of experts that suggest possible strategy profiles. More
importantly, by consulting carefully chosen experts we can greatly
improve the convergence rate to Nash equilibria, but in the case where
the experts do not return useful advice, we still have guarantees
that our algorithm will eventually converge. The goal of our work is
to bridge the gap between theoretical and practical learning, and we
argue that our approach, FRAME, can serve as a framework
for a class of multiagent learning algorithms.
|
Date |
July 26, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-25 |
Title |
Stronger lower bounds for text searching and polynomial evaluation |
Authors |
Alex Golinski |
Abstract |
In this paper, we give two main technical results:
(i) we show a stronger lower bound for the substring search problem via compression,
extending results of Demaine and L{\'o}pez-Ortiz (SODA '01);
(ii) we improve
the results of Gal and Miltersen (ICALP '03) by showing a bound on the redundancy
needed by the polynomial evaluation problem that is linear in terms of
the information-theoretic minimum storage required by a polynomial.
|
Date |
December 20, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-26 |
Title |
WILL NOT BE SUBMITTED |
Authors |
Alex Golinski |
Abstract |
|
Date |
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-27 |
Title |
WILL NOT BE SUBMITTED |
Authors |
Alex Golinski |
Abstract |
|
Date |
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-28 |
Title |
Reconstructing Orthogonal Polyhedra from Putative Vertex
Sets |
Authors |
Therese Biedl and Burkay Genc |
Abstract |
In this report we study the problem of reconstructing
orthogonal polyhedra
from a putative vertex set, i.e., we are given a set of points and want
to find an orthogonal polyhedron for which this is the set of vertices.
We are especially interested in testing whether such a polyhedron
is unique. Since this is not the case in general, we focus on orthogonally
convex polyhedra, and give an $O(n \log n)$ algorithm to find the answer.
We then consider the case where the given set of points may be rotated
beforehand. For 2D, we prove uniqueness and provide an $O(n \log n)$
algorithm to obtain the answer, which can then be used for an
$O(n^2 \log n)$ algorithm in 3D.
|
Date |
August 8, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-29 |
Title |
Robustness in Automatic Physical Database Design |
Authors |
Kareem El Gebaly |
Abstract |
Automatic physical database design tools rely on ``what-if''
interfaces to the query optimizer to estimate the execution time of the
training query workload under different candidate physical designs. The
tools use these what-if interfaces to recommend physical designs that minimize
the estimated execution time of the input training workload. Minimizing
estimated execution time alone can lead to designs that are not robust
to query optimizer errors and workload changes. In particular, if the optimizer
makes errors in estimating the execution time of the workload queries,
then the recommended physical design may actually degrade the performance
of these queries. In this sense, the physical design is risky. Furthermore,
if the production queries are slightly different from the training queries,
the recommended physical design may not benefit them at all. In this sense,
the physical design is not general. We define Risk and Generality as two
new measures aimed at evaluating the robustness of a proposed physical
database design, and we show how to extend the objective function being
optimized by a generic physical design tool to take these measures into
account. We have implemented a physical design advisor in PostqreSQL, and
we use it to experimentally demonstrate the usefulness of our approach.
We show that our two new metrics result in physical designs that are more
robust, which means that the user can implement them with a higher degree
of confidence. This is particularly important as we move towards truly
zero-administration database systems in which there is not the possibility
for a DBA to vet the recommendations of the physical design tool before
applying them.
|
Date |
August 15, 2007
|
Comments |
M.Math thesis |
Report |
|
|
Adobe PDF |
|
CS-2007-30 |
Title |
Support for Collaborative Feature-Based Product Configuration
in Software Product Lines |
Authors |
Marcilio Mendonca and Donald Cowan |
Abstract |
In Software Product Lines (SPLs), product configuration
is a decision-making process in which a group of stakeholders indicate
the features desired for a particular product (software). A feature model
is normally used to represent the spectrum of available configuration decisions
and thus works as a guide to the configuration process. Although in practice
product configuration is seen as a collaborative activity that involves
satisfying stakeholders with divergent interests and skills, current configuration
technology is essentially single-user-based in which user requirements
are interpreted and translated into configuration decisions by a single
role referred to as the application engineer. As a consequence, product
configuration becomes time-consuming and inaccurate especially in the case
of large product lines. This technical report discusses a doctoral research
proposal on Collaborative Product Configuration (CPC). The research aims
at investigating the major challenges of realizing CPC in SPLs and, subsequently,
at developing an approach that explicitly addresses the problems identified.
CPC concepts, algorithms, and tool support are discussed and some preliminary
experimental results are shown. The research is expected to bring contributions
to the SPLs field by paving the way for a deeper understanding of collaborative
product configuration, and ultimately by fostering the development of newer
and better approaches in the future.
|
Date |
August 20, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-31 |
Title |
Categorization of Implicit Invocation Systems |
Authors |
Rolando Blanco and Paulo Alencar |
Abstract |
Development and maintenance of implicit invocation systems
is not as
well understood and supported as the development of explicit
invocation systems. The situation is aggravated in systems that allow
the composition of functionality developed by different organizations,
potentially using different programming languages and
methodologies. In this document, we categorize the various ways in
which implicit invocation systems can define, bind to, announce,
subscribe, and deliver events. We also categorize the architectures
and topologies of implicit invocation systems. The purpose of the
categorization is to increase the understanding of the type of
requirements that these systems impose on software development
practices. As the categories are introduced, documented representative
invocation systems are discussed in order to illustrate the types of
systems that fall under each given category. Our categorization applies
to systems as varied as active databases, aspect-oriented
applications, and distributed heterogeneous event-based systems.
|
Date |
August 22, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-32 |
Title |
A Fast Unified Optimal Route Query Evaluation Algorithm |
Authors |
Edward P.F. Chan and Jie Zhang |
Abstract |
We investigate the problem of how to evaluate, fast and
efficiently, classes of optimal route queries on a massive
graph in a unified framework. To evaluate a route query
effectively, a large network is partitioned into a collection
of fragments, and distances of some optimal routes in the
network are pre-computed. Under such a setting, we find
a unified algorithm that can evaluate classes of optimal
route queries. The classes that can be processed efficiently
are called constraint preserving (CP) which include,
among
others, shortest path, forbidden edges/nodes, α-autonomy,
and some of hypothetical weight changes optimal route
query classes. We prove the correctness of the unified algorithm.
We then turn our attention to the optimization of
the proposed algorithm. Several pruning and optimization
techniques are derived that minimize the search time and
I/O accesses. We show empirically that these techniques
are effective. The proposed optimal route query evaluation
algorithm, with all these techniques incorporated, is compared
with main-memory and disk-based brute-force
CP algorithms. We show experimentally that the proposed
unified algorithm outperforms the brute-force algorithms,
both in terms of CPU time and I/O cost, by a wide margin.
|
Date |
September 6, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-33 |
Title |
Succinct Data Structures,
Adaptive (analysis of) Algorithms:
Overview, Combination, and Perspective |
Authors |
Jeremy Barbay |
Abstract |
Succinct data structures replace static instances of pointer
based
data structures, improving performance in both time and space in the
word RAM model (a restriction of the RAM model where the size of a
word is restricted). The adaptive analysis of algorithms considers
complexity in a finer way than merely grouping the instances by
size, yielding more precise lower and upper bounds on the complexity
of a problem. We give a quick overview of those two techniques,
some brief examples of how they can be combined on various search
problems to obtain near optimal solutions, and some general
perspective on the application and development of those techniques
to other problems and models.
|
Date |
September 13, 2007
|
Comments |
|
Report |
|
PostScript |
Adobe PDF |
|
CS-2007-34 |
Title |
Impress: Towards Next-Generation Ubiquitous and Pervasive
Computing Systems |
Authors |
James P. Black, Hao Chen, and Omar Zia Khan |
Abstract |
Research in pervasive and ubiquitous computing has resulted
in a number of prototype systems developed and implemented in isolation.
If this research area is to have impact, the next generation of research
prototypes must move beyond single laboratories and isolated, independent
software frameworks. Future systems must not only inter-work and inter-operate,
but should enable more rapid prototyping and incorporation of components
and software artifacts developed by others. Developing and exchanging software,
including the concepts embodied in it, requires an approach based on standards
and widely available software infrastructure.
The main contribution of the paper is to show that an appropriate
infrastructure for next-generation ubiquitous and pervasive computing
systems can be found in the Jabber open-source community. Jabber, now
standardized as XMPP, was originally conceived as an open-source alternative
to proprietary instant-messaging systems. Its design, architecture,
and implementation provide a rich, extensible, standardized selection
of concepts, protocols, servers, components, and clients. We show how
Jabber provides an excellent foundation for the implementation of smart
spaces, and a robust platform on which to build a ubiquitous-computing
ecology, while supporting ambitious research goals in second-generation
smart spaces. We also discuss various proof-of-concept prototypes that
we have developed to validate our argument. Finally, we demonstrate
how Jabber has allowed us to pursue research in pervasive computing
without requiring substantial effort in a start-up phase.
|
Date |
October 3, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-35 |
Title |
Rapid and Accurate Protein Side Chain Prediction
Using Local Backbone Information Only |
Authors |
Jing Zhang, Xin Gao, Jinbo Xu, and Ming Li (The first two
authors contributed equally to the paper.) |
Abstract |
High-accuracy protein structure modeling demands accurate and very
fast side chain prediction, since such a procedure must be repeatedly
called at each step of structure refinement. Many known side chain
prediction programs, such as SCWRL and TreePack, rest on the philosophy
that global information and pairwise energy functions must be used to
achieve high accuracy. These programs are too slow for cases where side
chain packing has to be performed thousands of times, such as protein
structure refinement and protein design. We present the unexpected
result that local backbone information alone can determine side chain
conformations accurately. LocalPack, our side chain packing program
based only on local information, achieves accuracy equal to SCWRL and
TreePack, yet runs many times faster, hence providing a key missing
piece in our efforts toward high-accuracy protein structure modeling.
|
Date |
October 3, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-36 |
Title |
Fast and Optimal Scheduling Over Multiple Network
Interfaces |
Authors |
Matei A. Zaharia and
Srinivasan Keshav |
Abstract |
Today's mobile phones already contain two or more network
interfaces (NICs)
and future devices are likely to have several more, each with different
energy costs,
dollar costs, and data transmission capacities.
Given a mobile device that is running multiple data-oriented
applications, such as email, instant messenger, and video download, some
of which may be delay-tolerant, it becomes necessary to assign, to each
unit of application data, the network interface and the time of transmission
that maximizes user satisfaction. This scheduling algorithm must take
into account not only user, application, and NIC characteristics, but
also future opportunistic availability of NICs due to device mobility.
In this paper, we begin by showing that naive scheduling approaches
can be far from optimal.
We then show that although optimal schedules can be found using either
linear programming or network flow, these classical approaches are
infeasible in CPU- and memory-constrained mobile devices.
This motivates our novel hill-climbing approach that exploits problem
structure to
optimally schedule application messages over multiple NICs assuming
perfect
knowledge of future connectivity.
We prove mathematically that our algorithm is correct.
Detailed performance measurements show that our algorithm finds optimal
solutions 10-100 times faster than the simplex method and network flow.
We also describe an implementation of our solution in a J2ME-based scheduler
that runs on laptops, smartphones and PDAs, demonstrating its effectiveness
in a realistic setting.
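
To give a sense of the classical formulation, the sketch below encodes a
tiny fractional scheduling instance as a linear program with scipy; the
instance and every number are made up, and the paper's point is precisely
that solving such LPs is too costly on constrained devices, which motivates
its hill-climbing algorithm:

import numpy as np
from scipy.optimize import linprog

# Toy instance: 3 messages, 2 NICs, 2 time slots. cap[n][t] is the byte
# capacity of NIC n in slot t (0 = unavailable, modeling future
# opportunistic connectivity as the device moves).
sizes   = np.array([4.0, 2.0, 3.0])           # bytes per message
utility = np.array([5.0, 3.0, 4.0])           # satisfaction if delivered
cost    = np.array([[0.2, 0.2], [1.0, 1.0]])  # energy cost per byte, per NIC/slot
cap     = np.array([[4.0, 0.0], [3.0, 3.0]])

M, N, T = 3, 2, 2
idx = lambda m, n, t: (m * N + n) * T + t     # flatten x[m, n, t]

# Objective: maximize delivered utility minus energy cost (linprog minimizes).
c = np.zeros(M * N * T)
for m in range(M):
    for n in range(N):
        for t in range(T):
            c[idx(m, n, t)] = -(utility[m] - cost[n, t] * sizes[m])

A_ub, b_ub = [], []
for m in range(M):                  # each message is sent at most once
    row = np.zeros(M * N * T)
    for n in range(N):
        for t in range(T):
            row[idx(m, n, t)] = 1.0
    A_ub.append(row); b_ub.append(1.0)
for n in range(N):                  # NIC capacity per slot
    for t in range(T):
        row = np.zeros(M * N * T)
        for m in range(M):
            row[idx(m, n, t)] = sizes[m]
        A_ub.append(row); b_ub.append(cap[n, t])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1))
print(res.x.round(2).reshape(M, N, T))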
|
Date |
October 23, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-37 |
Title |
Recommender Schemes for Peer-to-Peer Systems |
Authors |
Loubna Mekouar, Youssef Iraqi, and Raouf Boutaba |
Abstract |
In Peer-to-Peer (P2P) file sharing systems, peers spend
a
significant amount of time looking for relevant and interesting
files. However, the files available for download represent, on the one
hand, a rich collection for different needs and preferences and, on
the other hand, a challenge for peers to find the files that they
like. In this paper, we propose new recommender schemes based on collaborative
filtering. Peers collaborate to filter out
irrelevant files. These schemes help peers find and discover new,
interesting and relevant files. We propose recommender schemes based
on Files' Popularity and/or Peers' Similarity. To
overcome the problems of traditional collaborative filtering
recommender systems, an implicit rating approach is used. Simulation
results confirm the effectiveness of Peers' Similarity based
Recommendation in providing accurate recommendations. In addition,
the proposed recommender schemes are proactive. Recommendations are
provided to peers to motivate them to download the recommended
files.
|
Date |
November 20, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-38 |
Title |
Experimental Evaluation of List Update Algorithms for Data
Compression |
Authors |
Reza Dorrigiv, Alejandro Lopez-Ortiz, and J. Ian Munro |
Abstract |
List update algorithms have been widely used as subroutines in
compression schemes, especially after the introduction of the
Burrows-Wheeler transform.
The Burrows-Wheeler transform (BWT), which is the basis of most
state-of-the-art general purpose compressors, defines a permutation of
the
text with increased apparent regularity. Then it applies a compression
algorithm on the permuted version of the original text.
List update algorithms are a common choice for this second stage of
BWT-based compression.
As well, list update algorithms have been shown to be a reasonable
alternative to simple compression schemes such as Huffman coding
\cite{bentley}.
In this paper we perform an experimental comparison of various list
update algorithms both as stand alone compression mechanisms and as a
second stage of the BWT-based compression.
Our experiments show that for straightforward compression, competitive
list update algorithms such as MTF, while simpler, are an inferior
choice to Huffman-based compression;
in contrast, they are a good choice as a second stage of BWT-based compression.
Then we address the question of which list update algorithm is best
for this second stage.
We show that MTF outperforms other list update algorithms in practice
after BWT.
This is consistent with the intuition that BWT increases locality of
reference and the predicted result from the locality of reference
model of Angelopoulos et al.~\cite{bijective}.
Lastly, we show theoretically that due to an often neglected
difference in the cost models, good list update algorithms may be far
from optimal for BWT compression.
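
For reference, a minimal sketch of MTF coding, the list update algorithm
these experiments favor as the second stage after the BWT (generic textbook
MTF, not the paper's experimental code):

def mtf_encode(data, alphabet):
    # Move-to-front: emit each symbol's current position in the list, then
    # move that symbol to the front. Runs of equal symbols (common after
    # the BWT) become runs of zeros, which compress well.
    table = list(alphabet)
    out = []
    for ch in data:
        i = table.index(ch)
        out.append(i)
        table.insert(0, table.pop(i))
    return out

print(mtf_encode("aaabbbabbc", "abc"))  # [0, 0, 0, 1, 0, 0, 1, 1, 0, 2]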
|
Date |
October 22, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-39 |
Title |
The Cooperative Ratio of On-line Algorithms |
Authors |
Reza Dorrigiv and Alejandro Lopez-Ortiz |
Abstract |
On-line algorithms are usually analyzed using competitive
analysis, in
which the performance of an on-line algorithm on a sequence is
normalized by the performance of the optimal off-line algorithm on
that sequence. In this paper we introduce cooperative analysis as an
alternative general framework for the analysis of on-line algorithms.
The idea is to normalize the performance of an on-line algorithm by a
measure other than the performance of the off-line optimal algorithm
OPT. We show that in many instances the performance of OPT on a sequence
is a coarse approximation of the difficulty or complexity of a given
input. Using a finer, more natural measure we can separate paging and
list update algorithms which were otherwise indistinguishable under
the classical model. This creates a performance hierarchy of
algorithms which better reflects the intuitive relative strengths
between them. Lastly, we show that, surprisingly, certain randomized
algorithms which are superior to MTF in the classical model are not so
in the cooperative case, which matches experimental results. This
confirms that the ability of the on-line cooperative algorithm to
ignore pathological worst cases can lead to algorithms that are more
efficient in practice.
|
Date |
October 22, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-40 |
Title |
Design and Implementation of the KioskNet System
(Extended Version) |
Authors |
S. Guo, M.H. Falaki, U. Ismail, E.A. Oliver, S. Ur Rahman,
A. Seth, M.A. Zaharia, and S. Keshav
David R. Cheriton School of Computer Science
University of Waterloo
Waterloo, Ontario N2L 3G1
{sguo, mhfalaki, uismail, eaoliver, surrahman, a3seth, mazahari, keshav}@uwaterloo.ca |
Abstract |
Rural Internet kiosks in developing regions can
cost-effectively provide communication and information services
to the poorest sections of society.
Yet, a variety of technical and non-technical issues have caused most
kiosk deployments to be economically
unsustainable.
KioskNet addresses the key technical problems underlying kiosk failure
by using robust
`mechanical backhaul' for connectivity, and by using
low-cost and reliable kiosk-controllers to support services delivered from one
or more recycled PCs.
KioskNet also addresses related issues such as
security, user management, and log collection.
In this paper, we describe the KioskNet system, outlining its hardware, software,
and security architecture. We describe a pilot deployment, and how we used lessons
from this deployment to re-design our initial prototype.
|
Date |
November 23, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-41 |
Title |
Circular Arcs as Primitives for
Vector Textures |
Authors |
Zheng Qin, Craig S. Kaplan, and Michael D. McCool |
Abstract |
Because of the resolution independent nature of vector graphics images, it
is useful to consider using them directly as textures in 3D rendering.
Spline curves are used to represent boundaries in standard vector graphics
representations. Antialiased rendering of such content can be obtained by
soft thresholding an implicit representation of these boundary features. The
distance function is an especially useful implicit representation, and an
accurate distance can also be used for special effects such as embossing and
stroke texturing. Unfortunately, computation of the true distance to a
spline curve is expensive. Therefore, normally either the distance is
approximated or the spline curves are approximated with simpler primitives
for which an accurate distance can be computed. Approximation with line
segments gives only $C^0$ continuity. We approximate spline curves using
circular arcs instead. This approximation has $C^1$ continuity, and
computing the distance to a circular arc
is nearly equivalent in expense to computing the distance to a line segment.
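
A minimal sketch of a point-to-arc distance computation (our own
parameterization of an arc by center, radius, and ccw start/end angles; not
the paper's shader code), showing that it costs little more than a
point-to-segment distance:

import math

def dist_to_arc(p, center, r, a0, a1):
    # Distance from point p to the circular arc with the given center,
    # radius r, and ccw angular extent from a0 to a1 (radians). Cheap:
    # essentially an atan2, a square root, and two endpoint checks.
    dx, dy = p[0] - center[0], p[1] - center[1]
    ang = math.atan2(dy, dx)
    # Is the angle of p inside the arc's ccw sweep from a0 to a1?
    sweep  = (a1 - a0) % (2 * math.pi)
    offset = (ang - a0) % (2 * math.pi)
    if offset <= sweep:
        return abs(math.hypot(dx, dy) - r)   # nearest point is radial
    # Otherwise the nearest point is one of the two arc endpoints.
    ends = [(center[0] + r * math.cos(a), center[1] + r * math.sin(a))
            for a in (a0, a1)]
    return min(math.hypot(p[0] - e[0], p[1] - e[1]) for e in ends)

# Quarter arc of the unit circle, from angle 0 to pi/2:
print(dist_to_arc((2, 0), (0, 0), 1.0, 0.0, math.pi / 2))   # 1.0
print(dist_to_arc((0, -1), (0, 0), 1.0, 0.0, math.pi / 2))  # sqrt(2), to (1, 0)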
|
Date |
December 13, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-42 |
Title |
Design and Implementation of an SMS Control Channel for Mobile
Systems |
Authors |
Earl Oliver |
Abstract |
The Short Message Service (SMS) is one of the most ubiquitous
wireless
technologies on Earth. Each year hundreds of billions of messages are
sent,
demand continues to grow, and competition between cellular providers
is
driving prices down. These trends create practical opportunities for
the
use of SMS in mobile systems. In this paper we present the design and
implementation of an SMS-based data channel, or SMS-NIC, that is lightweight
and runs on multiple mobile platforms. Through integration with an existing
mobile platform, we show that the SMS-NIC has little operational overhead
and provides efficient, reliable transport for large messages sent over
the cellular network, where message losses frequently occur.
We motivate the design of the SMS-NIC through a characterization of
SMS
using workloads consisting of bursts of messages between sender and
receiver. This analysis differs from previous SMS studies by focusing
on
transmission patterns that differ from normal cell phone use. Through
this
characterization we show that message delay and message loss are
affected by
the transmission order, the time of day has minimal effect on transmission
rate, delay, and loss, and that losses and high delays significantly
outweigh the rate or message reordering.
|
Date |
February 18, 2008
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-43 |
Title |
Securing KioskNet: A Systems Approach |
Authors |
Sumair Ur Rahman, Urs Hengartner, Usman Ismail and S. Keshav |
Abstract |
Internet kiosks typically provide weak security guarantees
and
therefore cannot support secure web access or transaction-oriented
applications such as banking and bill payment. We present the design
and implementation of a practical, unobtrusive and easy-to-use
security architecture for KioskNet, a system for low-cost rural
Internet kiosks. Our system uses a combination of physical and
cryptographic mechanisms to protect user data and infrastructure nodes
despite frequent disconnections, untrusted administrators, and large
end-to-end delays. Our contributions include (a) a detailed threat
analysis of rural kiosks, (b) the first comprehensive security
architecture for rural kiosk security, (c) a simple and easy-to-use
API for securely sending and receiving files over KioskNet, (d)
protection for user data both on the file system and in transit, and
(e) a usable public key infrastructure (PKI). Performance measurements
show that our system imposes 100ms of delay during user login at
kiosks and an additional 560ms and 750ms per megabyte of data sent and
received by a kiosk user, respectively, suggesting that the system is
both efficient and usable.
|
Date |
November 15, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-44 |
Title |
A Preliminary Report on Tool Support and Methodology for
Feature Interaction Detection |
Authors |
Alma L. Juarez-Dominguez, Nancy A. Day, Richard T. Fanson |
Abstract |
In this report, we describe our effort to create in MATLAB's
Stateflow a set of non-proprietary advanced automotive feature design models,
and to translate these design models into models that can be input to the
model checker SMV. We are interested in verifying the absence of feature
interactions in the integration of the automotive features, as well as
the lack of errors in the design. In the automotive domain, a feature is
a bundle of system functionality recognized by the driver and providing
advanced functionality to the vehicle, for instance, Cruise Control (CC).
Each feature is normally implemented in software and has a degree of control
over the mechanical components that operate the dynamics of the vehicle.
Examples of these mechanical components are brakes, throttle and steering.
We have created our set of non-proprietary feature design models to assess
different techniques and tools that analyze the integration of the features
and their correctness. The translated models will allow us to verify properties
of the automotive features at the same level of description as the design,
so we can ensure that the findings of our analysis are applicable to the
design of the automotive components.
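[Editorial sketch: a toy illustration of the failure mode the report targets, two independently correct features issuing conflicting commands to shared actuators. The feature names and logic are invented for illustration and are not drawn from the report's models.]

```python
# Hypothetical features contending for throttle and brake.
def cruise_control(state):
    # Hold the set speed: open throttle when below it.
    return {"throttle": 0.3} if state["speed"] < state["set_speed"] else {}

def forward_collision_avoidance(state):
    # Brake hard when an obstacle is close.
    return {"brake": 1.0} if state["obstacle_dist"] < 20.0 else {}

def detect_interaction(state, features):
    commands = [f(state) for f in features]
    # Interaction: one feature accelerates while another brakes.
    accel = any(c.get("throttle", 0) > 0 for c in commands)
    brake = any(c.get("brake", 0) > 0 for c in commands)
    return accel and brake

state = {"speed": 80, "set_speed": 100, "obstacle_dist": 15.0}
assert detect_interaction(state, [cruise_control,
                                  forward_collision_avoidance])
```

A model checker such as SMV explores all reachable states for such conflicts rather than the single state tested here.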
|
Date |
December 18, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-45 |
Title |
Internalization on the Toronto Stock Exchange |
Authors |
Michael Lerman and Kate Larson |
Abstract |
This is a study of the impact of internalization on the
Toronto Stock
Exchange (TSX). We present an abstract model of the internalization
process and discuss some observations and conclusions that can be
drawn from the model, including results from a simulation study. We
describe a system that we built in order to use actual order data
from the Toronto Stock Exchange, and report our findings and
experiences from working with this data.
|
Date |
November 15, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-46 |
Title |
On Stability and Convergence of Multi-Commodity Networks and Services |
Authors |
Jin Xiao and Raouf Boutaba |
Abstract |
The rise of distributed services and user-driven networking
concepts
in recent years poses the critical question of stability. Can a
system operating under non-cooperation and self-interest converge to
a stable state? And how fast? The answers to these questions readily
lend themselves to game-theoretic analysis, and to the study of
congestion games in particular. In the past, much work has been
done on establishing the existence of pure Nash equilibria in
congestion games, and it has been shown that finding a pure Nash
equilibrium is PLS-complete, and hence that convergence to a pure
Nash equilibrium is very difficult (exponential time in the worst case).
Furthermore, much of the convergence analysis has been carried out
on simple single-commodity game models. In this paper, we attempt to
construct a more realistic multi-commodity congestion game model
suited for distributed service and user-driven networking scenarios.
We introduce the concept of desirability of equilibrium, which is helpful
in determining whether a system state meets the quality requirements
of the users and services. Desirability is an alternative concept to
the price of anarchy; in fact, we show that the desirability ratio is a
special case of the price of anarchy. We then define the
$\alpha$-threshold congestion game whose minimum potential state
corresponds to a desirable equilibrium (if the system permits one)
and we bound its convergence to polynomial time through game
transformation. Finally, we present a mechanism for partial
simultaneous moves. To the best of our knowledge, there has been no
prior establishment of the desirability concept and no bound given
on the convergence of asymmetric multi-commodity congestion games
with exponential cost function.
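[Editorial sketch: best-response dynamics in a simple congestion game, as background for the convergence claims above. The game instance, sizes, and cost function are illustrative assumptions.]

```python
# Minimal symmetric congestion game: each player picks one resource,
# and a resource's cost grows with its load. Such games admit
# Rosenthal's potential function, so every strictly improving move
# lowers the potential and best-response dynamics must terminate
# at a pure Nash equilibrium.
from collections import Counter

def cost(load):                        # per-resource congestion cost
    return load ** 2

resources, players = 3, 6
choice = [0] * players                 # everyone starts on resource 0

def best_response(player):
    """Move to a strictly cheaper resource; return True if moved."""
    load = Counter(choice)
    cur = choice[player]
    best, best_cost = cur, cost(load[cur])
    for r in range(resources):
        c = cost(load[r] + (0 if r == cur else 1))
        if c < best_cost:
            best, best_cost = r, c
    if best == cur:
        return False
    choice[player] = best
    return True

while any(best_response(p) for p in range(players)):
    pass                               # loop until no one can improve
print(Counter(choice))                 # balanced loads: a pure Nash equilibrium
```

The paper's contribution is bounding how many such steps are needed in the much harder asymmetric multi-commodity setting.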
|
Date |
November 14, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-47 |
Title |
Understanding Participatory Media Using Social Networks |
Authors |
Aaditeshwar Seth |
Abstract |
Given the rapid growth of participatory media content
such as blogs, user-created videos, and podcasts, there is a need
to understand and answer the following kinds of questions: What is
participatory media good for? Does it improve the effectiveness of
news media? If so, then why? What are the underlying processes of
human communication that can explain how people perceive and interpret
the information they gain from participatory media? How can this
understanding be used to design better information search and recommendation
systems for the Internet? We try to answer these questions by examining
participatory media through models built using social networks of
people. We first give a few examples of participatory media which indicate
that participation by people in online public discourse through
blogs and forums can indeed improve the effectiveness of news media:
it helps people gain a better understanding of topics discussed in
mass media, and presents people with diverse viewpoints to avoid media
bias. We then use social networks to propose a model for the
underlying processes of communication to answer why and how participatory
media would improve the effectiveness of news media.
An investigation methodology is then outlined for validation of the model,
and validation results from user-surveys and measurements
done on an online social networking website are presented. Our results
are positive and indicate that a social-network-based
framework offers a plausible approach to the design of useful applications
for participatory media.
|
Date |
January 14, 2008
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-48 |
Title |
Optimal Speedup on a Low-Degree Multi-Core Parallel Architecture
(LoPRAM) |
Authors |
Reza Dorrigiv, Alejandro Lopez-Ortiz and Alejandro Salinger |
Abstract |
Modern microprocessor architectures have gradually incorporated
support for parallelism.
In the past the degree of parallelism was rather small and as such it
could be best modeled as
a constant speedup over the traditional RAM model; however, as a consequence
of continued
growth this assumption might no longer hold. For example, with the introduction
of 32- and
64-bit architectures, bit-level parallelism became significant. This
led to the introduction of
the transdichotomous RAM model, for which many algorithms which are faster
in theory and
practice have been developed. Similarly, over the last five years, major
microprocessor manufacturers have released road maps for the next decade
predicting a rapidly
increasing number
of cores, with upwards of 64 cores per microprocessor by 2015. In such
a setting a sequential
RAM computer no longer accurately reflects the architecture on which
algorithms are being
executed. In this paper we propose a model of low degree parallelism
(LoPRAM) which builds
upon the RAM and PRAM models yet better reflects recent advances in parallel
(multi-core)
architectures. The model aims to combine fidelity to the underlying hardware
with ease of programming and analysis. When necessary, we make tradeoffs
between what is
achievable in hardware and what is understandable and programmable by
humans/compilers.
The proposed model supports a high-level abstraction that simplifies
the design and analysis
of parallel programs. Then we show that in many instances this model,
though in many ways
similar to the classic PRAM, naturally leads to work-optimal parallel
algorithms via simple
modifications to sequential algorithms. This is in contrast to the PRAM
model, in which the
design, analysis, and implementation of work-optimal algorithms proved
to be one of the
biggest challenges in practice for their adoption.
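[Editorial sketch: the kind of "simple modification" of a sequential algorithm the model targets. The paper's formal primitives are not reproduced here; chunking and merge strategy below are illustrative assumptions.]

```python
# Hedged sketch of low-degree parallelism: with p cores, sort p chunks
# in parallel and merge. Total work stays O(n log n), giving a
# work-optimal O((n log n)/p + n log p) sort; the final k-way merge
# here is sequential, which is fine for small, fixed p.
import heapq
from concurrent.futures import ProcessPoolExecutor

def parallel_sort(data, p=4):
    size = -(-len(data) // p)                        # ceil division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor(max_workers=p) as pool:
        runs = list(pool.map(sorted, chunks))        # p independent sorts
    return list(heapq.merge(*runs))                  # k-way merge of runs

if __name__ == "__main__":
    import random
    xs = [random.random() for _ in range(100_000)]
    assert parallel_sort(xs) == sorted(xs)
```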
|
Date |
November 26, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2007-49 |
Title |
Extending Typestate Analysis to Multiple Interacting Objects |
Authors |
Nomair A. Naeem and Ondrej Lhotak |
Abstract |
This paper extends static typestate analysis to temporal
specifications
of groups of interacting objects, which are expressed using
tracematches. Unlike typestate, a tracematch state may change due to
operations on any of a set of objects bound by the tracematch. The
paper proposes a lattice-based operational semantics which is proved
equivalent to the original tracematch semantics but is better suited
to
static analysis. The static analysis is presented next, and is proved
sound with respect to the semantics. The analysis computes precise
local points-to sets and tracks the flow of individual objects, thereby
enabling strong state updates. A fully context-sensitive version of
the analysis has been implemented as instances of the IFDS and IDE
algorithms. The analysis was evaluated on tracematches used in earlier
work and found to be very precise. Remaining imprecisions could be
eliminated with more precise modeling of references from the heap and
of
exceptional control flow.
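[Editorial sketch: a concrete multi-object property of the kind a tracematch expresses. This is the classic "unsafe iterator" example from the tracematch literature, not necessarily one of the paper's benchmarks; the monitor below checks it at runtime, whereas the paper's static analysis proves such sequences impossible.]

```python
# The pair (collection c, iterator i) steps through states:
#   create(c, i) -> update(c) -> next(i)  ==  violation.
# Note the state changes on an operation touching only c, which is
# what distinguishes tracematches from single-object typestate.
class Monitor:
    def __init__(self):
        self.state = {}                    # (id(c), id(i)) -> state

    def on_create(self, c, i):
        self.state[(id(c), id(i))] = "iterating"

    def on_update(self, c):
        for (cid, iid), s in self.state.items():
            if cid == id(c) and s == "iterating":
                self.state[(cid, iid)] = "stale"

    def on_next(self, c, i):
        if self.state.get((id(c), id(i))) == "stale":
            raise RuntimeError("next() on iterator of modified collection")

m = Monitor()
xs, it = [1, 2, 3], iter([1, 2, 3])
m.on_create(xs, it)
m.on_update(xs)                            # collection modified
try:
    m.on_next(xs, it)                      # tracematch fires here
except RuntimeError as e:
    print(e)
```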
|
Date |
December 20, 2007
|
Comments |
|
Report |
|
|
Adobe PDF |
|