2010 Technical Reports | SCS | UW
CS-2010-01 |
Title |
Points-To Analysis with Efficient Strong Updates |
Authors |
Ondrej Lhotak and Kwok-Chiang Andrew Chung |
Abstract |
This paper explores a sweet spot between flow-insensitive and flow-sensitive subset-based points-to analysis.
Flow-insensitive analysis is efficient: it has been applied to million-line programs and its worst-case requirements
are quadratic space and cubic time. Flow-sensitive analysis is precise because it allows strong updates so that
points-to relationships holding in one program location can be removed from the analysis when they no longer hold in other
locations. We propose a "Strong Update" analysis combining both features: it is efficient like flow-insensitive analysis, with the same worst-case bounds, yet its precision benefits from strong updates like flow-sensitive analysis. The key enabling insight is that strong updates are applicable when the dereferenced points-to set is a singleton, and a singleton set is cheap to analyze. The analysis therefore focuses flow sensitivity on singleton sets. Larger sets, which will not lead to strong updates, are modelled flow insensitively to maintain efficiency. We have implemented and evaluated the analysis as an extension of the standard flow-insensitive points-to analysis in the LLVM compiler infrastructure.
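To make the key insight concrete, here is a minimal sketch of the singleton rule (ours, not the paper's LLVM implementation; `process_store` and the set representation are hypothetical): a store through a pointer is a strong update only when the dereferenced points-to set is a singleton, and falls back to a weak, flow-insensitive update otherwise.

```python
# Hypothetical sketch: strong vs. weak updates in a subset-based
# points-to analysis. All names are illustrative, not from the paper.

def process_store(pts, p, q):
    """Process a store '*p = q' given points-to sets pts[v]."""
    targets = pts[p]
    if len(targets) == 1:
        (loc,) = targets          # exactly one location is written
        pts[loc] = set(pts[q])    # strong update: old contents replaced
    else:
        for loc in targets:       # weak update: contents only accumulate
            pts[loc] |= pts[q]

pts = {"p": {"x"}, "q": {"o1"}, "x": {"o2"}}
process_store(pts, "p", "q")
assert pts["x"] == {"o1"}  # o2 was strongly removed
```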
|
Date |
Feb 1, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-02 |
Title |
Distributed XML Query Processing: Fragmentation, Localization and Pruning |
Authors |
Patrick Kling, M. Tamer Özsu, Khuzaima Daudjee |
Abstract |
Distributing data collections by fragmenting them is an effective way of improving the scalability of a database system. While the distribution of relational data is well understood, the unique characteristics of XML data and its query model present challenges that require different distribution techniques. In this paper, we show how XML data can be fragmented horizontally and vertically. Based on this, we propose solutions to two of the problems encountered in distributed query processing and optimization on XML data, namely localization and pruning. Localization takes a fragmentation-unaware query plan and converts it into a distributed query plan that can be executed at the sites that hold XML data fragments in a distributed system. We then show how the resulting distributed query plan can be pruned so that only those sites that can contribute to the query result are accessed. We demonstrate that our techniques can be integrated into a real-life XML database system and that they significantly improve the performance of distributed query execution.
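As a rough illustration of the pruning idea, the sketch below (our assumption, not the paper's algorithm; `prune_plan` and the path-prefix test are invented) discards the sites whose fragments provably cannot contribute to a query:

```python
# Hypothetical sketch of distributed-plan pruning: keep only the sites
# whose XML fragments might contribute to the query. The fragmentation
# metadata and the matching rule are illustrative, not from the paper.

def prune_plan(fragments, query_path):
    """fragments: site -> set of root paths of the XML fragments it holds."""
    def may_contribute(paths):
        # Crude test: the query path and a fragment path lie on one
        # root-to-leaf line if either is a prefix of the other.
        return any(p.startswith(query_path) or query_path.startswith(p)
                   for p in paths)
    return {site for site, paths in fragments.items() if may_contribute(paths)}

fragments = {
    "site1": {"/catalog/books"},
    "site2": {"/catalog/music"},
    "site3": {"/catalog/books/reviews"},
}
print(prune_plan(fragments, "/catalog/books"))  # {'site1', 'site3'} (order may vary)
```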
|
Date |
Sept 14, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-03 |
Title |
Understanding the Effects and Implications of
Gesture-Based Interaction for Dynamic Presentations |
Authors |
Adam Fourney, Michael Terry and Richard Mann |
Abstract |
Gesture-based interaction has long been seen as a natural
means of input for electronic presentation systems. However,
gesture-based presentation systems have not been evaluated
in real-world contexts. This paper presents the design
and evaluation of Maestro, a gesture-based presentation
system whose design was informed by observations of real-world
practices. To understand the implications of gesture-based
interaction, Maestro was deployed in a classroom setting
for two weeks. The study results indicate that gestures
that support interaction with content are valued most, as opposed
to those that support slide navigation. Notably, past
systems have only used gestures for slide navigation. Our
study also revealed that the presenter would position, orient,
and conduct himself in ways to more reliably perform gestures
and perceive system feedback, and to avoid accidental
gesture recognition. However, these behaviors negatively
impacted presentation dynamics. Collectively, these results
outline clear directions for future research.
|
Date |
Feb 11, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-04 |
Title |
Incorporating Trust in Network Virtualization |
Authors |
Loubna Mekouar, Youssef Iraqi and Raouf Boutaba |
Abstract |
In this paper, we propose a trust management framework
for network virtualization environments. The proposed
framework helps distinguish among the infrastructure
providers based on their past experiences and feedback
from service providers. We describe the main components
involved in gathering the feedback and managing
the reputation data. We detail the proposed trust system
and we define the concept of Degree of Involvement
of an infrastructure provider in terms of nodes and links.
We also perform a mathematical analysis of the proposed
system. Simulation results confirm the effectiveness of the
proposed trust management system in increasing the service
providers’ satisfaction and reducing the selection of infrastructure
providers that do not fulfill their Service Level
Agreement (SLA). |
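For illustration only, here is a toy aggregation in the spirit of the framework, weighting each piece of feedback by the provider's degree of involvement; the `reputation` function and its formula are our assumptions, not the paper's trust system.

```python
# Toy sketch: reputation of an infrastructure provider, weighting each
# service provider's feedback by the degree of involvement (fraction of
# the virtual network's nodes/links it hosted). The aggregation formula
# is an illustrative assumption, not the paper's.

def reputation(feedback):
    """feedback: list of (score in [0,1], degree_of_involvement in (0,1])."""
    total_weight = sum(doi for _, doi in feedback)
    if total_weight == 0:
        return None  # no rated involvement yet
    return sum(score * doi for score, doi in feedback) / total_weight

# A provider heavily involved in a poorly rated request is penalized
# more than one that hosted only a small part of it.
print(reputation([(1.0, 0.2), (0.3, 0.8)]))  # 0.44
```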
Date |
Mar 4, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-05 |
Title |
An End-user Domain Specific Model
to Drive Dynamic User Agents Adaptations |
Authors |
Ingrid Oliveira de Nunes, Simone Diniz Junqueira Barbosa, Carlos José Pereira de Lucena |
Abstract |
Modeling automated user tasks based on agent-oriented approaches is a promising but
challenging task. Personalized user agents have been investigated as a potential way of addressing
this issue. Most recent research has focused on learning, eliciting and reasoning about user
preferences and profiles. In this paper, our goal is to deal with the engineering of such systems,
which is barely discussed in the literature. In this context, we present a high-level domain-specific model
whose aim is to give users the power to customize and dynamically adapt their user agents. In
addition, we propose a general architecture for developing agent-based systems that can be customized
to users.
|
Date |
Mar 3, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-06 |
Title |
Data Driven Smartphone Energy Level Prediction |
Authors |
Earl Oliver and S. Keshav |
Abstract |
The body of mobile applications is growing at a near-exponential
rate; many applications are increasing in scale, complexity,
and demand for energy. The energy density of
smartphone batteries is increasing at a comparatively insignificant
rate, which limits the practicality of these applications.
Despite the importance of energy to mobile applications,
energy is rarely considered by mobile application developers,
primarily due to a lack of knowledge about users
and the absence of energy-aware developer tools.
We conduct a large-scale user study to measure the energy
consumption characteristics of 15,500 BlackBerry smartphone
users. Our dataset is several orders of magnitude larger than
any previous work. Using this dataset we build the Energy
Emulation Toolkit (EET), which allows developers to
evaluate the energy consumption requirements of their applications
against real user energy traces. The EET computes
the successful execution rate of energy-intensive applications
across all users, specific devices, and specific smartphone
user types. We also consider active adaptation to
energy constraints. By classifying smartphone users based
on their charging characteristics, we demonstrate that energy
level can be predicted with 72% accuracy a full day in advance,
and that, through an Energy Management Oracle, energy-intensive
applications can adapt their execution to achieve a
near-optimal successful execution rate.
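A hypothetical sketch of the classify-then-predict idea described above (the features, thresholds, and sample data are our assumptions, not the EET's):

```python
# Illustrative sketch (not the EET itself): predict tomorrow's battery
# level from a user's historical levels at the same hour, after grouping
# users by a crude charging profile. Thresholds are assumptions.

from statistics import mean

def charging_class(charge_events_per_day):
    # Assumed classes: frequent "toppers" vs. once-a-day "cyclers".
    return "topper" if charge_events_per_day >= 2 else "cycler"

def predict_level(history, hour):
    """history: list of per-day {hour: battery level in [0,1]} dicts."""
    samples = [day[hour] for day in history if hour in day]
    return mean(samples) if samples else None

history = [{9: 0.95, 21: 0.40}, {9: 0.90, 21: 0.35}, {9: 0.88, 21: 0.50}]
print(charging_class(2))                      # 'topper'
print(round(predict_level(history, 21), 2))   # 0.42
```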
|
Date |
May 15, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-07 |
Title |
Modeling Context-Aware Distributed Event-Based Systems |
Authors |
Eduardo S. Barrenechea, Rolando Blanco, and Paulo Alencar |
Abstract |
Emerging applications are becoming increasingly dynamic,
adaptive and context-aware in areas such as
just-in-time location-based m-commerce, situational
health monitoring, and dynamic social networking
collaboration. Although numerous systems and implementation
infrastructures have been proposed to
deal with some features of such systems, there is
a lack of higher-level modeling abstractions and associated
component metamodels that combine dynamic
features, adaptation mechanisms, and context-awareness.
In this report we propose a metamodel for context-aware
distributed event-based systems that supports
novel forms of event-based context-aware interactions
such as context event schemas, component interfaces
that react to context, event and context interactions
at the interface level, and subscriptions based on both
events and context. Our approach also supports requirements
for context-aware systems proposed in the
literature. The applicability of the approach is illustrated
by showing how existing context-aware systems
can be modeled using our metamodel.
|
Date |
April 12, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-08 |
Title |
Reducing Waste in Data Center Network Upgrades |
Authors |
Andrew R. Curtis, S. Keshav, and Alejandro Lopez-Ortiz |
Abstract |
Fundamental limitations of traditional data center network
architectures have led to the development of architectures
that provide enormous bisection bandwidth for up to hundreds
of thousands of servers. To implement any of these solutions,
a legacy data center operator usually must replace
most switches in their network—a tremendously wasteful
proposition in terms of capital costs, energy used in manufacturing,
and raw materials. To enable such upgrades, we
introduce a construction for the first data center network topology
that supports heterogeneous switches while providing
guarantees on bisection bandwidth and using optimal link
capacity. We finish by describing an algorithm that finds the
minimum cost upgrade path for an existing network to support
a given traffic load.
|
Date |
April 12, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-09 |
Title |
Factored HMMs for Bimanual, Context-Dependent Gestures |
Authors |
Adam Fourney, Richard Mann, Michael Terry |
Abstract |
As we expand our use of hand gestures for interacting
with computational devices, the spatial context in which
gestures are performed becomes an increasingly important
feature for interpreting user intent. In this paper, we demonstrate
how spatial context and coordinated bimanual hand
motion can be efficiently modeled using a factored hidden
Markov model. This factorization, guided by topological
constraints relating to feature extraction, reduces the number
of modeling parameters by two orders of magnitude
compared to an unfactored model. We used these factored
models when constructing a gesture-based presentation system
called Maestro. We then performed a series of experiments
to evaluate the performance of Maestro’s recognizer
with and without the spatial and bimanual context features. |
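A back-of-the-envelope illustration of the parameter-count argument; the state-space sizes and factorization below are invented for illustration, not taken from the paper.

```python
# State = (left hand, right hand, spatial context) with N1, N2, N3 values.
N1, N2, N3 = 30, 30, 10

# Unfactored HMM: one transition matrix over the joint state space.
joint = (N1 * N2 * N3) ** 2

# Factored HMM: each hand transitions given its own previous value and
# the context; the context evolves on its own.
factored = N1 * N1 * N3 + N2 * N2 * N3 + N3 * N3

print(joint, factored, joint // factored)  # 81000000 18100 4475
```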
Date |
May 28, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-10 |
Title |
Feature and Class Models in Clafer: Mixed, Specialized, and Coupled |
Authors |
Kacper Bak, Krzysztof Czarnecki, and Andrzej Wasowski |
Abstract |
We present Clafer, a class modeling language with first-class
support for feature modeling. We designed Clafer as a concise notation for
class models, feature models, mixtures of class and feature models (such
as components with options), and models that couple feature models and
class models via constraints (such as mapping feature configurations to
component configurations). Clafer also allows arranging models into multiple
specialization and extension layers via constraints and inheritance.
We identify four key mechanisms that allow a class modeling language to
express feature models concisely and show that Clafer meets its design
objectives using a sample product line.
|
Date |
June 25, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-11 |
Title |
On-the-Fly Counterexample Abstraction for Model Checking
Invariants |
Authors |
Alma L. Juarez-Dominguez, Nancy A. Day |
Abstract |
Verification of invariants using model checking can find errors, inconsistencies, and contradictions in
a model. In some applications, it is beneficial to see at once all paths of the model in which the invariant
fails. Thus, instead of the traditional cycle of find bug – fix bug – re-run model checker, we would like
to produce all counterexamples prior to fixing the bug. However, the set of all counterexamples is often
too large to generate or comprehend. We define a series of abstractions that each represent the complete
set of counterexamples in a reduced set of equivalence classes. Our abstractions are based on the control
states and transitions of an extended finite state machine (EFSM) model. We show how our abstractions
can be represented as LTL properties and, therefore, reduce the set of counterexamples on-the-fly during
model checking.
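A rough sketch of the loop this suggests (our reading of the abstract; `model_check` and the trace interface are hypothetical, and the class abstraction is simplified to the failing control transition): each discovered class is blocked by weakening the checked property with an LTL disjunct, so the next counterexample must come from a new class.

```python
# Hypothetical sketch of on-the-fly counterexample abstraction.
# `model_check` stands in for an external model checker; it is assumed
# to return a violating trace, or None when the property holds.

def counterexample_classes(model, invariant, model_check):
    """Enumerate one counterexample per class (failing control transition)."""
    blocked, classes = [], []
    while True:
        # A counterexample to this disjunction violates the invariant AND
        # never takes a blocked transition -- i.e. it exhibits a new class.
        prop = " | ".join([f"G({invariant})"] + [f"F({t})" for t in blocked])
        trace = model_check(model, prop)
        if trace is None:
            return classes
        t = trace.failing_transition()   # assumed trace interface
        classes.append(trace)
        blocked.append(t)
```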
|
Date |
July 5, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-12 |
Title |
LEGUP: Using Heterogeneity to Reduce the Cost of Data
Center Network Upgrades |
Authors |
Andrew R. Curtis, S. Keshav and Alejandro Lopez-Ortiz |
Abstract |
Fundamental limitations of traditional data center network
architectures have led to the development of architectures
that provide enormous bisection bandwidth for up to hundreds
of thousands of servers. Because these architectures
rely on homogeneous switches, implementing one in a legacy
data center usually requires replacing most existing switches.
Such forklift upgrades are typically prohibitively expensive;
instead, a data center manager should be able to selectively
add switches to boost bisection bandwidth. Doing so adds
heterogeneity to the network’s switches, and heterogeneous
high-performance interconnection topologies are not well understood.
Therefore, we develop the theory of heterogeneous
Clos networks. We show that our construction needs only as
much link capacity as the classic Clos network to route the
same traffic matrices, and that this bound is optimal. Placing
additional equipment in a highly constrained data center is
challenging in practice, however. We propose LEGUP to design
the topology and physical arrangement of such network
upgrades or expansions. Compared to current solutions, we
show that LEGUP finds network upgrades with more bisection
bandwidth for half the cost. And when expanding a data
center iteratively, LEGUP’s network has 265% more bisection
bandwidth than an iteratively upgraded fat-tree.
|
Date |
July 7, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-13 |
Title |
Recognizing handwritten mathematics via fuzzy parsing |
Authors |
Scott MacLean and George Labahn |
Abstract |
We present a new approach to multi-dimensional parsing using relational grammars
and fuzzy sets. A fast, incremental parsing algorithm is developed, motivated by the
two-dimensional structure of written mathematics. Our approach makes no hard decisions; recognized math expressions are reported to the user in ranked order. A flexible
correction mechanism enables users to quickly choose the correct math expression in
case of recognition errors or ambiguity.
|
Date |
June 13, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-14 |
Title |
Talk is Cheap(er): Mitigating DoS and Byzantine
Attacks in Sensor Networks |
Authors |
Valerie King, Jared Saia and Maxwell Young |
Abstract |
Sensor networks are extremely vulnerable to denial-of-service (DoS) attacks due to their shared communication medium and the constrained power supply of the devices. We examine the scenario where one player attempts to send a message m to another player in a single-channel network. Assume an adversary that jams for T instances, where T is unknown to either player and where cost is measured in terms of energy expenditure. We give a protocol guaranteeing delivery of m while the ratio of either player's expected cost to the adversary's cost is O(T^(φ−2)) ≈ O(T^(−0.382)), where φ = (1+√5)/2 is the golden ratio.
Our result implies that to prevent communication of m, the adversary incurs an asymptotically higher cost than the expected cost incurred by either correct player. This property holds even when the receiver suffers a Byzantine fault. We extend our analysis to multiple receivers and, later, to a multi-hop setting where both senders and receivers can suffer Byzantine faults. Our work pertains to a wide variety of adversaries that are energy-constrained. Notably, we can tolerate an adaptive adversary who launches attacks using knowledge of the past actions of all players. Moreover, in networks with a sufficient amount of communication traffic, we can tolerate a reactive adversary who may detect a transmission in the current time slot and then decide to jam. Finally, we apply our result to the fundamental network communication problem of reliable broadcast, which deals with conveying a message from one player to all other players in the network. In a popular sensor network grid model, we give a reliable broadcast protocol that, in comparison to previous work in the grid, offers improved resilience to DoS attacks.
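Spelling out the exponent in the claimed ratio (standard algebra, not quoted from the report):

```latex
% phi - 2 is negative, so the players' relative cost vanishes as the
% adversary's jamming budget T grows.
\[
  \varphi = \frac{1+\sqrt{5}}{2} \approx 1.618,
  \qquad
  \varphi - 2 \approx -0.382,
\]
\[
  \frac{\mathrm{E}[\text{cost of a correct player}]}{\text{cost of the adversary}}
  \;=\; O\!\left(T^{\varphi-2}\right) \;=\; O\!\left(T^{-0.382}\right)
  \;\longrightarrow\; 0 \quad (T \to \infty).
\]
```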
|
Date |
November 8, 2010 |
Comments |
Originally posted August 3, 2010; updated November 8, 2010 (3 minor typos corrected) |
Report |
|
|
Adobe PDF |
|
CS-2010-15 |
Title |
Paging for Multicore (CMP) Caches |
Authors |
Alejandro Lopez-Ortiz and Alejandro Salinger |
Abstract |
In the last few years, multicore processors have become the dominant processor architecture.
While cache eviction policies have been widely studied both in theory and practice for
sequential processors, in the case in which various simultaneous processes use a shared cache,
the performance of even the most common eviction policies is not yet fully understood, nor do
we know if current best strategies in practice are optimal. In particular, there is almost no
theoretical backing for the use of current eviction policies in multicore processors. Recently,
work by Hassidim [14] initiated the theoretical study of cache eviction policies for multicore
processors under the traditional competitive analysis, showing that LRU is not competitive against
an offline policy that has the power of arbitrarily delaying request sequences to its advantage.
In this paper we study caching under the more conservative model in which requests must be
served as they arrive. We perform a thorough all-to-all comparison of strategies, providing lower
and upper bounds for the ratios between their performances. We show that if the cache is partitioned,
the partition policy has a greater influence on performance than the eviction policy. On the
other hand, we show that sharing the cache among cores is in general better than partitioning
the cache, unless the partition is dynamic and changes frequently, in which case shared cache
strategies and dynamic partitions are essentially equivalent when serving disjoint requests.
|
Date |
July 29, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-16 |
Title |
Autonomous Thermal Control System |
Authors |
Omid Ardakanian |
Abstract |
Building heating and cooling accounts for roughly 18-24% of
all energy usage. Building energy management can reduce
both a building's operating costs and its carbon footprint. We propose
the use of a factored Partially Observable Markov Decision
Process (POMDP) to model and efficiently control a commercial
building heating system. We use supervised learning and
ground truth data to capture the parameters of the POMDP.
We show that our solution outperforms the HVAC system in
terms of energy efficiency.
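A minimal sketch of the factored-POMDP ingredients named above; the state factors, sensor model, and rewards are invented for illustration, whereas the report learns its parameters from ground-truth data.

```python
# Illustrative sketch of a factored POMDP for heating control.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    occupied: bool    # hidden factor, observed only via noisy sensors
    temp_band: str    # "cold" | "comfort" | "hot"

def reward(s: State, a: str) -> float:
    # Penalize discomfort when occupied and energy spent heating.
    comfort = 1.0 if (not s.occupied or s.temp_band == "comfort") else 0.0
    energy = -0.3 if a == "heat_on" else 0.0
    return comfort + energy

def update_belief(p_occupied: float, motion: bool) -> float:
    """Bayes update of the occupancy belief from one motion reading."""
    p_m_occ = 0.7 if motion else 0.3   # assumed sensor model
    p_m_emp = 0.1 if motion else 0.9
    num = p_m_occ * p_occupied
    return num / (num + p_m_emp * (1 - p_occupied))

print(reward(State(True, "cold"), "heat_on"))      # -0.3
print(round(update_belief(0.5, motion=True), 2))   # 0.88
```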
|
Date |
October 5, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-17 |
Title |
Direct Adaptive Control of Electricity Demand |
Authors |
S. Keshav and C. Rosenberg
|
Abstract |
The legacy electrical grid upper-bounds a customer’s energy
demand using a circuit breaker. The breaker is conservatively
sized so that it rarely ‘trips,’ interrupting supply
and inconveniencing the customer. This results in a system
whose peak load can be much higher than its average load.
Such a system can be made reliable only by being provisioned,
at great cost, for infrequent peaks. The goal of demand
management is to reduce the peak load and the resulting
infrastructure provisioning cost. The most widely used
such approach is to price a unit of energy higher during peak
times, giving an incentive to consumers to reduce their demand
during these times. Although this does reduce the peak
load by 2-10%, this ‘human in the loop’ approach is onerous
and frequently ineffective. Instead, we propose a radically
different approach that draws upon the Internet traffic
management approaches of proactive and reactive congestion
control for fine-grained management of customer demand
without human intervention. We show that this direct
adaptive control of electricity demand can significantly reduce
the peak load and therefore infrastructure costs. Moreover,
it allows individual users to place a greater load than
in the existing system, as long as it does not hurt other customers
or the grid itself, thus increasing revenues. We believe
that our approach has the potential to avert grid congestion,
reduce capital costs, and eliminate a portion of the
carbon footprint of the grid.
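The congestion-control analogy suggests a controller like the following sketch; the AIMD-style rule, constants, and congestion signal are our own assumptions, and the report's actual control laws may differ.

```python
# Sketch of reactive demand control in the spirit of TCP's AIMD.

def aimd_step(allowed_kw, congested, alpha=0.1, beta=0.5, floor_kw=1.0):
    """One control interval: back off multiplicatively on a congestion
    signal from the grid, otherwise probe upward additively."""
    if congested:
        return max(floor_kw, allowed_kw * beta)
    return allowed_kw + alpha

allowed = 5.0
for congested in [False, False, True, False]:
    allowed = aimd_step(allowed, congested)
    print(round(allowed, 2))   # 5.1, 5.2, 2.6, 2.7
```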
|
Date |
September 27, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-18 |
Title |
Characterizing the Usability of Interactive Applications
Through Query Log Analysis |
Authors |
Adam Fourney, Richard Mann and Michael Terry |
Abstract |
People routinely rely on Internet search engines to support
their use of interactive applications, making query logs a rich
source of data cataloguing the day-to-day tasks and needs of
a user base. In this paper, we introduce an automated process
for harvesting, ordering, labeling, and grouping search
queries related to any publicly available interactive system.
The end result is a data set that can complement and augment
data collected through traditional usability methods. We call
this process CUTS—characterizing usability through search.
The labeled, ordered data produced by CUTS can be assembled
in minutes, is timely, has a high degree of ecological
validity, and is arguably less prone to self-selection bias than
traditional usability methods. We describe this process and
demonstrate applications of its use with a handful of interactive
systems.
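A toy sketch of one CUTS-like step, canonicalizing and grouping application-related queries; the normalization rules are our assumptions, not the paper's exact procedure.

```python
import re
from collections import Counter

def canonicalize(query: str) -> str:
    q = query.lower().strip()
    # Drop common question scaffolding (an assumed, simplified rule set).
    q = re.sub(r"\b(how (do i|to)|can i|why does)\b", "", q)
    return re.sub(r"\s+", " ", q).strip()

queries = [
    "how do i GIMP remove background",
    "gimp remove background",
    "How to gimp remove background",
    "gimp draw straight line",
]
groups = Counter(canonicalize(q) for q in queries)
print(groups.most_common(1))  # [('gimp remove background', 3)]
```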
|
Date |
September 28, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-19 |
Title |
Understanding How Users Express Preferences: A User Study
|
Authors |
Ingrid Oliveira de Nunes, Simone Diniz Junqueira Barbosa and Carlos José Pereira de Lucena |
Abstract |
With the exponential growth of information available on the Web and our
24/7 availability through social networks, instant messengers and e-mail, people are facing
the challenge of processing huge amounts of data and playing multiple roles at the same
time. Personal Assistance Software (PAS) aims at aiding users in these tasks, from making
recommendations to acting on their behalf. Even though extensive research has been
done on improving such systems, little attention has been given to their acceptance by
users. With the goal of building users' trust in PAS, we aim at developing an end-user
language in order to empower users to control and instruct their PAS to support
task delegation. However, such a language would be meaningless if users are not able to
adequately express their preferences. This paper presents a user study whose goal was
to provide a deeper understanding of how users express their preferences. Seven research
questions are investigated, including how knowledge about a domain influences the
expression of preferences and how users change them after being exposed to decision-making
situations. This study allowed us to identify the kinds of support users need to better
express their preferences so that a system can act on their behalf.
|
Date |
October 14, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-20 |
Title |
Answer Set Programming or Hypercleaning:
Where does the Magic Lie in Solving Maximum Quartet Consistency? |
Authors |
Fathiyeh Faghih and Daniel G. Brown
|
Abstract |
Answer set programming (ASP) is an efficient approach to solving the maximum quartet consistency problem. We distinguish two factors that affect the efficiency of a recent ASP approach
for solving the maximum quartet consistency problem: answer set programming itself and a variety of preprocessing steps. In this paper, we propose
a method for applying one of the preprocessing steps used in the ASP
approach to an alternative algorithm. Our results show that the preprocessing step contributes most to the efficiency of the ASP approach
for data sets with fewer than 20 taxa. We also identify an issue in the ASP
approach for solving the maximum quartet consistency problem.
Keywords: Maximum Quartet Consistency, Answer Set Programming, Edge Detection Strategy, Efficiency
|
Date |
November 3, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-21 |
Title |
Orthogonal cartograms with few corners per face |
Authors |
Therese Biedl and Lesvia Elena Ruiz Velázquez |
Abstract |
We give an algorithm to create orthogonal drawings of 3-connected 3-regular planar graphs
such that each interior face of the graph is drawn with a prescribed area. This algorithm
produces a drawing with at most 12 corners per face and 4 bends per edge, improving on the
previously known result of 34 corners per face.
|
Date |
December 3, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|
CS-2010-22 |
Title |
Interaction-Aware Scheduling of Report Generation Workloads |
Authors |
Mumtaz Ahmad, Ashraf Aboulnaga, Shivnath Babu and Kamesh Munagala
|
Abstract |
The typical workload in a database system consists
of a mix of multiple queries of different types that run
concurrently. Interactions among the different queries in a
query mix can have a significant impact on database performance.
Hence, optimizing database performance requires
reasoning about query mixes rather than considering queries
individually. Current database systems lack the ability to do
such reasoning. We propose a new approach based on planning
experiments and statistical modeling to capture the impact
of query interactions. Our approach requires no prior
assumptions about the internal workings of the database system
or the nature and cause of query interactions, making it
portable across systems.
To demonstrate the potential of modeling and exploiting
query interactions, we have developed a novel interaction-aware
query scheduler for report-generation workloads. Our
scheduler, called QShuffler, uses two query scheduling algorithms
that leverage models of query interactions. The first
algorithm is optimized for workloads where queries are submitted
in large batches. The second algorithm targets workloads
where queries arrive continuously, and scheduling decisions
have to be made on-line. We report an experimental
evaluation of QShuffler using TPC-H workloads running on
IBM DB2. The evaluation shows that QShuffler, by modeling
and exploiting query interactions, can consistently outperform
(by up to 4x) query schedulers in current database systems.
Keywords: Business Intelligence, Report Generation, Query Interactions, Scheduling, Experiment-driven Performance Modeling, Workload Management.
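A greedy sketch of what interaction-aware batch scheduling might look like; `predict_cost` is a hypothetical stub standing in for QShuffler's learned models of query-mix performance, and the toy cost rule is our assumption.

```python
def schedule_batch(pending, running, slots, predict_cost):
    """Fill free slots with the queries predicted to mix best with
    the currently running queries."""
    running = list(running)
    while pending and len(running) < slots:
        best = min(pending, key=lambda q: predict_cost(running + [q]))
        pending.remove(best)
        running.append(best)
    return running

# Toy stand-in model: queries of the same type slow each other down.
def predict_cost(mix):
    return sum(mix.count(q) for q in mix)   # penalizes duplicates

print(schedule_batch(["scan", "join", "scan"], ["scan"], 3, predict_cost))
# -> ['scan', 'join', 'scan']  (the join is admitted before a second scan)
```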
|
Date |
December 16, 2010 |
Comments |
|
Report |
|
|
Adobe PDF |
|