2001 technical reports

CS-2001-26
Title	Finding Hidden Independent Sets in Interval Graphs
Authors	T. Biedl, B. Brejova, E. D. Demaine, A. M. Hamel, A. Lopez-Ortiz, T. Vinar
Abstract	We design efficient competitive algorithms for discovering information hidden by an adversary. Specifically, consider a game in a given interval graph $G$ in which the adversary chooses an independent set $X$ in $G$. Our goal is to discover this hidden independent set $X$ by making the fewest queries of the form ``Is point $p$ covered by an interval in $X$?'' Our interest in this problem stems from two applications: experimental gene discovery with PCR technology and the game of Battleship (in a one-dimensional setting). We provide adaptive algorithms for both the verification scenario (given an independent set, is it $X$?) and the discovery scenario (find $X$ without any information). Under some assumptions on the interval graph, these algorithms use an asymptotically optimal number of queries in every instance.
Date	December 2001
Report	Finding Hidden Independent Sets in Interval Graphs (PDF)			Compressed PostScript: Finding Hidden Independent Sets in Interval Graphs (PS.Z)

CS-2001-25
Title	A Direct Algorithm to Constract the Minimal Telescopers for Rational Functions (q-Difference Case)
Authors	H.Q. Le, S.A. Abramov and K.O. Geddes
Abstract	In this paper we present a direct algorithm to construct the minimal telescopers for rational functions for the q-difference case.
Date	November 2001
Report	A Direct Algorithm to Constract the Minimal Telescopers for Rational Functions (q-Difference Case) (PDF)		Compressed PostScript: A Direct Algorithm to Constract the Minimal Telescopers for Rational Functions (q-Difference Case) (PS.Z)

CS-2001-24
Title	HypergeometricSum: A Maple Package for Finding Closed Forms of Indefinite and Definite Sums of Hypergeometric Type
Authors	H.Q. Le, S.A. Abramov and K.O. Geddes
Abstract	We describe the Maple module HypergeometricSum which provides various tools for finding closed forms of indefinite and definite sums of hypergeometric type, and for certifying a large class of combinatorial identities. The document is intended both as an an elementary introduction to the subject and as a reference manual for the module.
Date	November 2001
Report	HypergeometricSum: A Maple Package for Finding Closed Forms of Indefinite and Definite Sums of Hypergeometric Type (PDF)			Compressed PostScript: HypergeometricSum: A Maple Package for Finding Closed Forms of Indefinite and Definite Sums of Hypergeometric Type (PS.Z)

CS-2001-23
Authors	L. Stanchev
Abstract	In this paper we introduce bag relational algebra with aggregation over a particular representation of incomplete information called c-tables. The c-table representation is an extension of the Codd relational model with richer semantics for null values. The reason c-tables were chosen for our exploration is that they are the least expressive relational representation of incomplete information over which relational algebra is closed and can be "well defined". We justify the need for duplicate preserving relational algebra over this representation of incomplete information by introducing aggregation over c-tables with duplicates.
Date	October 2001
Comments	This report was generated as part of a project for a 700 level course at the University of Waterloo, Waterloo, ON, Canada
Report	Compressed PostScript: Codd tables with duplicates (PS)		Codd tables with duplicates (PDF)

CS-2001-22
Title	Evolution of Recurrent Neural Networks to Control Autonomous Life Agents
Authors	T. Abou-Assaleh
Abstract	Studies of artificial life (alife) attempt to simulate simple living beings. On the other hand, autonomous agents researchers are interested in building agents that are able to complete a particular task without supervision. In this research, these two areas of artificial intelligence are combined together into what we call "Autonomous Life Agent" (ALA). ALA is an artificial agent that is sent to some environment to live in without any supervision or any predefined behaviour rules. The primary goal of the agent is to learn how to survive in the artificial environment it lives in. In this research, we utilize a recurrent neural network to determine the agent's actions. A novel ALA Training System was developed that evolves recurrent neural networks using genetic algorithms. The resulting agents are capable of living in multiple similar worlds starting from random initial positions as well as in worlds that are unseen during the training.
Date	October 2001
Comments	Source code of the ALA Training System is available upon request. Appears in: Tony Abou-Assaleh, Jianna Zhang, and Nick Cercone. 2001. "Evolution of Recurrent Neural Networks to Control Autonomous Life Agents." Proceedings of the Genetic and Evolutionary Computation Conference, GECCO-2001. San Francisco, CA: Morgan Kaufmann Publishers. Page 903. Please direct all comments and suggestions to the author
Report	Evolution of Recurrent Neural Networks to Control Autonomous Life Agents (PDF)

CS-2001-21
Authors	A. Lopez-Ortiz
Date	October 2001
Report	Only available in printed format

CS-2001-20
Authors	A. Lopez-Ortiz
Date	October 2001
Report	Only available in printed format

CS-2001-19
Authors	P.A.S. Ward
Date	#issued July 4, 2001,
Report	Only available in printed format

CS-2001-18
Title	A Behavioral Analysis Approach to Pattern-Based Composition
Authors	Jing Dong, Paulo S.C. Alencar and Donald D. Cowan
Date	June 2001
Report	A Behavioral Analysis Approach to Pattern-Based Composition (PDF)

CS-2001-17
Title	User's Manual as a Requirements Specification
Authors	D.M. Berry, K. Daudjee, J. Dong, M.A. Nelson and T.Nelson
Date	May 2001
Report	User's Manual as a Requirements Specification (PDF)			Compressed PostScript: User's Manual as a Requirements Specification (PS.Z)

CS-2001-16
Authors	D. Toman
Date	# issued May 16th
Report	Only available in printed format

CS-2001-15
Title	Finding Shortest Paths in Large Network Systems
Authors	E.P.F. Chan and N. Zhang
Date	May 2001
Report	Finding Shortest Paths in Large Network Systems (PDF)			Compressed PostScript: Finding Shortest Paths in Large Network Systems (PS.Z)

CS-2001-14
Title	Optimal Arrangement of Leaves in the Tree Representing Hierarchical Custering of Gene Expression Data
Authors	T. Biedl, B. Brejova, E.D. Demaine, A.M. Hamel and T. Vinar
Abstract	In this paper, we study how to present gene expression data to display similarities by trying to find a linear ordering of genes such that genes with similar expression profiles will be close in this ordering. In general, finding the best possible order is intractable. Therefore we assume that hierarchical clustering has been applied to the gene expression profiles, and show that the best order respecting the clustering can be computed efficiently. We perform experiments comparing the optimal order to several other methods. The implementation of the algorithm, as well as a simple program for viewing hierarchically clustered expression array data and the complete results of our experiments are available at http://monod.uwaterloo.ca/supplements/01expr/
Comments	The paper will be submitted to WABI 2001.
Report	Optimal Arrangement of Leaves in the Tree Representing Hierarchical Custering of Gene Expression Data (PDF)			Compressed PostScript: Optimal Arrangement of Leaves in the Tree Representing Hierarchical Custering of Gene Expression Data (PS.Z)

CS-2001-13
Title	Process-Based Representation and Analysis of Framework Instantiation
Authors	Paulo S. C. Alencar, Donald D. Cowan, Toacy C. Oliviera, Carlos J. P. Lucena
Abstract	Object-oriented frameworks are currently regarded as a promising technology for reusing designs and implementations. However, developers find there is still a steep learning curve when extracting the design rationale and understanding the framework documentation during framework instantiation. Thus, instantiation is a costly process in terms of time, people and other resources. These problems raise a number of questions including: "How can frameworks be instantiated more quickly and with greater ease? How can the same high-level design abstractions that were used to develop the framework be used during framework instantiation instead of using source code as is done currently? How can we capture the designers' knowledge of the framework in order to compensate for the loss of key development personnel? How can we raise the level of abstraction in which the framework evolution and instantiation is expressed, reasoned about and implemented?" In this paper we present a process-based approach to framework instantiation that addresses these issues. Our main goal is to represent the framework architectural design models in an explicit and declarative way, and support changes to this architecture based on explicit instantiation processes and activities while maintaining system integrity, invariants, and general constraints. In this way, the framework instantiation and evolution can be performed in a valid and controlled way. To accomplish our goal, we introduce a process-oriented description of framework instantiation as well as a formal specification of these processes that allows us to reason about instantiation using model checking techniques. Discovering instantiation errors and analyzing alternative architectures and designs early in the development process could certainly lower the cost and effort in fixing them. Various forms of analysis can be performed using our approach to check properties such as: structural evolution properties, pre- and post-conditions of an instantiation, the validity of the processes associated with extension points, the order of the instantiation processes, the safety and liveness of these processes, reachability, and deadlocks. We illustrate our approach with DrawingTool, a framework to provide drawing features of a case tool.
Report	Process-Based Representation and Analysis of Framework Instantiation (PDF)

CS-2001-12
Title	A Framework For Community Information Systems
Authors	Martin Luo, Paulo S. C. Alencar, Donald D. Cowan
Abstract	In this paper we present a generic framework architecture for web-based community information systems (CIS). The framework has an open architecture based on COTS (commercial-off-the-shelf) software components and network technologies. We discuss how a component-based approach, a layered architecture model, and design patterns can be used to provide a common framework for CIS. The CIS framework architecture results in significant benefits that include reuse, a flexible user interface, powerful search mechanisms and an integrated and scalable architecture, XML and rule-based StyleSheet languages are used for storage, information search and graphical presentation at the server or client. The overall framework architecture, its individual components and the interaction among these components are outlined.
Report	A Framework For Community Information Systems (PDF)

CS-2001-11
Title	Monotonicity Preserving and Total Variaion Diminishing Multigrid Time Stepping Methods
Authors	A. Jameson and J.W.L. Wan
Abstract	We propose a fast multiplicative and additive multigrid time stepping schemes for solving linear and nonlinear wave equations in one dimension. The idea is based on an upwind biased interpolation and residual restriction operators, and a nonstandard coarse grid update formula for linear equations. We prove that the two-level schemes preserve monotonicity and are total variation diminishing, and the same results hold for the multilevel additive scheme. We generalize the idea to nonlinear equations by solving local Riemann problems. We demonstrate numerically that these schemes are essentially nonoscillatory, and that the optimal speed of wave propagation of 2^M-1 is achieved, where M is the number of grids.
Date	April 2001
Comments	Conference presentation: Tenth Copper Mountain Conference on Multigrid Methods Also appear in virtual proceedings at MGNET.
Report	Monotonicity Preserving and Total Variaion Diminishing Multigrid Time Stepping Methods (PDF)			Compressed PostScript: Monotonicity Preserving and Total Variaion Diminishing Multigrid Time Stepping Methods (PS.Z)

CS-2001-10
Title	Using OpenGL for Video Streaming
Authors	P. Gilhuly
Abstract	The process of streaming digital video is difficult due to the large amount of data inherent. Previously, without specialized hardware, video output has been limited to low resolution or slow playback, as measured by frames per second. Many of the functions of a video decoder can be mapped to the capabilities of a graphics card. However, for the graphics work involved with digital video a graphics card is required while a video decoder card is not. This thesis proposes a method to use common graphics cards, through the OpenGL API, for the task of video streaming.
Date	April 2001
Comments	I have included the video sequences I used to test my compressor and streamer as well as a PostScript and a PDF format of my thesis. Under the directories Hier and BFI you will find uncompressed AVIs (subsequently gzipped) of the frames generated by my streamer. Under the directory MPEG you will find MPEG 1 encoding of the three movies, compressed using default settings. MPEG/* - MPEG1 encoded versions of animations BFI/* - MPOG encoded versions using brute force search Hier/* - MPOG encoded versions using hierarchical search thesis.pdf - PDF version of thesis
Report	MPEG: ball, geri, tree (MPV)	Compressed AVI: geri_br (GZIP), tree_br (GZIP), geri_hi (GZIP), tree_hi (GZIP)	Using OpenGL for Video Streaming (PDF)	Compressed PostScript: Using OpenGL for Video Streaming (GZIP)

CS-2001-09
Title	Evaluation of Buffer Queries in Spatial Databases
Authors	E.P.F. Chan
Date	April 2001
Report	Evaluation of Buffer Queries in Spatial Databases (PDF)			Compressed PostScript: Evaluation of Buffer Queries in Spatial Databases (PS.Z)

CS-2001-08
Title	Just-in-time subgrammar extraction for HPSG
Authors	V. Keselj
Abstract	We define the basic problem of subgrammar extraction for head-driven phrase structure grammars (HPSG) in the following way: Given a large HPSG grammar G and a set of words W, find a small subgrammar of G that accepts the same set of sentences from W^* as G, and for each of them produces the same parse trees. The set of words W is obtained from a piece of text. Additionally, we assume that this operation is done ``just-in-time,'' i.e., just before parsing the text. This application requires that this operation be done in an automatic and efficient way. After defining the problem in the general framework, we discuss the problem for context-free grammars (CFG), and give an efficient algorithm for it. We show that finding the smallest subgrammar for HPSGs is an NP-hard problem, and give an efficient algorithm that solves an easier, approximate version of the problem. We also discuss how the algorithm can be efficiently implemented.
Date	March 2001
Report	Just-in-time subgrammar extraction for HPSG (PDF)			Compressed PostScript: Just-in-time subgrammar extraction for HPSG (PS.Z)

CS-2001-07
Authors	A. Garcia and D.D. Cowan
Date	March 2001
Report	Only available in printed format

CS-2001-06
Title	Text Structure Recognition using a Region Algebra
Authors	M. Young-Lai
Abstract	We consider the problem of incrementally developing a parser for text structure. This means building the parser specification a piece at a time while simultaneously developing our understanding of the text. We argue that existing solutions have usability and efficiency problems for this application and propose an alternative approach based on the type of region algebra that is often used as a query language for text databases. This is an appropriate interface for incremental development, but has no efficient batch parsing model such as those that exist for grammars. In this thesis, we propose an efficient batch parsing model and characterize the region algebras to which it applies.
Date	February 2001
Report	Text Structure Recognition using a Region Algebra (PDF)			Compressed PostScript: Text Structure Recognition using a Region Algebra (GZIP)

CS-2001-05
Title	Modular HPSG
Authors	V. Keselj
Abstract	We present a detailed description of the modular Head-driven Phrase Structure Grammar (HPSG). Although the notion of modularity is known in the area of programming languages, and it is described for context-free grammars (CFG), this is the first attempt of defining modularity for HPSGs. We describe and formally define modularity for an HPSG-type grammar, and we illustrate its application on an example.
Date	February 2001
Report	Modular HPSG (PDF)			Compressed PostScript: Modular HPSG (PS.Z)

CS-2001-04
Title	Optimum Multi-dimensional Interval Routing Schemes on Networks with Dynamic Cost Links
Authors	Y. Ganjali
Abstract	One of the fundamental tasks in any distributed computing system is routing messages between pairs of nodes. An Interval Routing Scheme (IRS) is a space efficient way of routing messages in a network. The problem of characterizing graphs that support an IRS is a well-known problem and has been studied for some variants of IRS. It is natural to assume that the costs of links may vary over time (dynamic cost links) and to try to find an IRS which routes all messages on shortest paths (optimum IRS). In this paper, we study this problem for a variant of IRS in which the labels assigned to the vertices are d-ary integer tuples (d-dimensional IRS). The only known results in this case are for specific graphs like hypercubes, n-dimensional grids, or for the one-dimensional case. We give a complete characterization for the class of networks supporting multi-dimensional strict and linear (which is a variation of IRS) interval routing schemes with dynamic cost links. Keywords: Interval routing, networks, routing algorithms, dynamic, multi-dimensional, characterization.
Date	March 2001
Report	Optimum Multi-dimensional Interval Routing Schemes on Networks with Dynamic Cost Links (PDF)			Compressed PostScript: Optimum Multi-dimensional Interval Routing Schemes on Networks with Dynamic Cost Links (PS.Z)

CS-2001-03
Title	Characterization of Networks Supporting Multi-dimensional Linear Interval Routing Schemes
Authors	Y. Ganjali
Abstract	An Interval routing scheme (IRS) is a well-known, space efficient routing strategy for routing messages in a distributed network. In this scheme, each node of the network is assigned an integer label and each link at each node is labeled with an interval. The interval assigned to a link l at a node v indicates the set of destination addresses which should be forwarded through l from v. A Multi-dimensional Interval Routing Scheme (MIRS) is a generalization of IRS in which each node is assigned a multi-dimensional label (which is a list of d integers for the d-dimensional case). The labels assigned to the links of the network are also multi-dimensional (a list of d one-dimensional intervals). The class of networks supporting linear IRS (in which the intervals are not cyclic) is already known for the one-dimensional case [FG94]. In this paper, we generalize this result and completely characterize the class of networks supporting linear MIRS (or MLIRS) for a given number of dimensions d. We show that by increasing d, the class of networks supporting MLIRS is strictly expanded. We also give a characterization of the class of networks supporting strict MLIRS, which is a modified version of MLIRS. Key words:Computer networks, interval routing schemes, graph theory, multi-dimensional, characterization.
Date	March 2001
Comments	Will be submitted to SIROCCO 2001
Report	Characterization of Networks Supporting Multi-dimensional Linear Interval Routing Schemes (PDF)			Compressed PostScript: Characterization of Networks Supporting Multi-dimensional Linear Interval Routing Schemes (PS.Z)

CS-2001-02
Title	Reducing the Size of Auxiliary Data Needed to Maintain Materialized Views by Exploiting Integrity Constraints
Authors	L. Stanchev
Abstract	A data warehouse consists of materialized views, which contain derived data from several data sources. The advantage of using materialized views is that they contain the results of standard queries and therefore when such queries are posed, the data sources those queries are based on, which are usually costly to access because of their size and remoteness, don't have to be accessed. However in order for the materialized views to contain up-to-data data they have to be updated periodically. Such synchronization of the materialized views with the data sources could be slow if the later have to be queries for a correct update to be done - i.e. if just the data for the changes done to the data sources is insufficient. A way querying the data sources during materialized view update can be avoided is by storing any data from the data sources, which could be relevant to an update of the materialized views, on the data warehouse site. Such auxiliary data is stored in auxiliary views and has the characteristic that it makes the data stored at the data warehouse self-maintainable, i.e. it can be update correctly from the log of changes done to the the data sources. In this paper we propose unique algorithms for keeping the size of such auxiliary views relatively small. This is achieved by using the additional information about the integrity constraints that hold on the data sources. More specifically we use an object-oriented model to describe a database schema and a subset of OQL as our query language. The proposed in the paper algorithms produce auxiliary views that make a materialized view self-maintainable relative to the three possible operations on a datbase: addition, deletion and object update.
Date	September 2001
Comments	It was submitted to CASCON 2000 as a paper but it was not accepted. The reported was created as a result of 700 level distributed database course at the University of Waterloo thought by Prof. Ken Salem.
Report	Reducing the Size of Auxiliary Data Needed to Maintain Materialized Views by Exploiting Integrity Constraints (PS)			Compressed PostScript: Reducing the Size of Auxiliary Data Needed to Maintain Materialized Views by Exploiting Integrity Constraints (PS.Z)

CS-2001-01
Title	Issues in Scalable Distributed-System Management
Authors	P.A.S. Ward
Abstract	Distributed-system management concerns the observation of a distributed computation and then using the information gained by that observation to control the computation. This necessitates collecting the information required to determine the partial order of execution, and then reasoning about that partial order. This in turn requires a partial-order data structure and, if the reasoning is being performed by a human, a system for visualizing that partial order. Both creating such a data structure and visualizing it are hard problems. Current partial-order data structure techniques suffer various shortcomings. Potentially scalable mechanisms, such as Ore timestamps, are static. Dynamic algorithms, on the other hard, either require a significant search operation to answer basic questions, or they require a vector of size equal to the width of the partial order for each element stored in the order. Scalable visualization of a partial order is hard for the same reasons that drawing any large graph is hard. Any visualization that will be meaningful to a user requires appropriate abstractions on the data structure, while preserving the core meaning of the data structure. Such abstraction is difficult. This report formalizes these problems and identifies the specific difficulties that must be solved to enable scalable distributed-system management. Keywords: distributed-system observation, partial-order data structure, vector timestamp, data-structure visualization, scalability
Date	January 2001
Report	Issues in Scalable Distributed-System Management (PDF)			Compressed PostScript: Issues in Scalable Distributed-System Management (PS.Z)