1996 Technical Reports | SCS | UW

[Please remove <h1>]


<1995 1997>
CS-96-46
Title Colour and Reflectance in Image Synthesis
Authors T. Pflaum
Abstract In recent years substantial research has been done on physically based image synthesis, which includes the algorithms used to create synthetic images using models and algorithms based on the underlying physics of light. The work presented in this thesis focuses on two particular aspects of image synthesis that have not gained much attention in the past: how to represent colour and how to describe the reflection of light by surfaces.

A simulation system is presented that takes a model of the micro structure of a surface and creates a representation of the bidirectional reflectance function (BRDF) of a surface having that micro structure. In addition, a technique is presented that creates a colour space, used for rendering, that is adapted to the surface colours actually appearing in the scene.
Date December 1996
Report Adobe PDF Compressed PostScript
CS-96-45
Title A Simple, General and Efficient Implementation of Geometric Operators and Predicates
Authors E. Chan and J.N.H. Ng
Abstract Shape and location of objects in a spatial database are commonly represented by geometric data such as points, lines and regions. Numerous geometric operators and predicates have been proposed for spatial database systems. Existing work on their implementation concentrate on individual operators and predicates. This approach makes the realization of geometric operators and predicates in a spatial database system difficult since they are diverse and their implementation in general are complex. In this paper, we present a simple plane-sweep algorithm that can be easily modified to realizeefficiently a set of frequently used line-region and region-region geometric operators and predicates. The design of this unified algorithm is based on the observation that the covering of elementary regions along the sweep line are updated locally and the implementation of these operators and predicates differ only with the output actions at an intersection point. Any geometric operator or predicate, the output of which can be determined by examing incident edges and covering information at intersection pints, can be implemented easily with the algorithm. To demonstrate its generality, extendibility, simplicity and efficiency, we concentrate on several popular geometric operators and predicates. All these operators and predicates can be realized in O((N+1) logN) time in the worst case, where N is the number of edges in the operands and I is the number of intersection pairs. The proposed algorithms is fully implemented in C++ and is tested on a Sun workstation. Although the paper focuses on operators and predicates involving at most two regions, this algorithm can be generalized nicely to r regions, where r>2. We describe what changes are needed to make to the basic algorithm to accommodate this generalization.
Date November 1996
Report PostScript
CS-96-44
Title Querying and Visualization of Geometric Data
Authors E. Chan and J.M.T. Wong
Abstract This paper describes the retrieval, browsing and visualization of data for a spatial database system developed at the University of Waterloo. The prototype, called QL/G, suppports geometric data types like general regions, lines and points as built-in data types. A Graphical User Inter- face (GUI), written in X/Motif and a persistent version of C++, is provided for users as an effective tool for querying and visualizing data retrieved. The interface is simple, user-friendly and flexible, and provides the necessary functionality identified in existing literature for geographic applications.
Date November 1996
Report PostScript
CS-96-43
Title The GUI For A Spatial Database Management System
Authors E. Chan, J.M.T. Wong, P.Y. Abbott and S. Tan
Abstract This paper documents the functionality provided by a spatial database management system (SDBMS) developed at the University of Waterloo.The prototype,called QL/G, supports data types. The functionality is described through the Graphical User Interface (GUI) provided by the system. The user interface component is an especially important part of a SDBMS. In particular, a SDBMS should provide a convenient front-end for definition, entry, visualization and manipulation of spatial data. The system should also address the complex issue of spatial processing and analysis. The GUI of QL/G is designed with the above objective in mind. The interface is simple, user-friendly and flexible, and yetprovides the user with the necessary functionality identified in existing literature for geographic applications.
Date November 1996
Report PostScript
CS-96-42
Title Symbolic Computation Group Annual Report 1996
Authors Symbolic Computation Group
Abstract The Symbolic Computation Group has as its primary goal the research and development of algorithms for computer algebra, including both symbolic computation and hybrid symbolic-numeric computation. The algorithms developed are incorporated into the Maple computer algebra system.

Particular areas of research and development which have been targeted during the past year include symbolic integration, computing solutions of ordinary and partial differential equations (both symbolically and numerically), extended-precision numerical evaluation of special functions, facilities for handling piecewise functions, pattern matching, matrix computations, and tensor computations.
Date November 1996
Report Adobe PDF Compressed PostScript
CS-96-40
Title Grammars++ for Modelling Information in Text
Authors A. Salminen and F.Wm. Tompa
Abstract Grammars provide a convenient means to describe the set of valid strings in a language, and thus they seem natural for describing the set of valid instances in a text database. It is well-known that a given language can be described by many grammars, and similarly text database designers have a choice of grammar for specifying valid documents. This flexibility can be exploited to provide information modelling capability by designing productions in the grammar to represent entities and relationships of interest to the database applications. Additional constraints can be specified by attaching predicates to selected non-terminals in the grammar.

In this paper, we formalize and illustrate the use of extended grammars for text databases. When used for database definition, grammars can provide the functionality that users have come to expect of database schemas. Extended grammars can also be used to specify database manipulation, including query, update, view definition, and index specification.
Date November 1996
Report Adobe PDF Compressed PostScript
CS-96-39
Title The Management of Data, Events, and Information Presentation for Network Management.
Authors M. Hasan
Abstract The purpose of a network management (NM) system is to monitor and control a network. Monitoring and control functions entail deal- ing with large volumes of data, events, and the presentation of relevant information on a management station. The management of data, events, and information presentation pose a major research and development challenge. In this thesis we focus on data and events management and information presentation issues of an NM system. Existing NM systems either use traditional da- tabase systems which are not well suited for a NM system or they lack intelligent event and information presentation management frameworks. None of the systems provide a unified framework for managing data, events and information presentation tasks on an NM station.

We believe that the complexities of network management posed by the issues mentioned above can be reduced substantially by ex- ploiting, enhancing and combining the features of new generation database systems, such as, active temporal and database visu- alization systems. An active database system where active behaviors are specified as Event-Condition-Action (ECA) rules is a suitable framework for NM data and events management. The Hy+ database visualization system with its sophisticated abstrac- tion and visualization capabilities is well-suited to meet the requirements of NM information presentation. By viewing the network as a conceptual global database, the network management functions can be specified as declarative database manipulation operations and Event-Condition-Action (ECA) rules.

But the facilities provided by existing active database systems are not enough for an NM system. For example, NM events manage- ment requires sophisticated systems for event correlation based on domain knowledge on the events and other relationships, such as, structural and temporal relationships between events. A number of existing active temporal database systems provide sup- port for a composite event specification language (CESL) (used in the E part of an ECA rule) that allows one to relate events in the temporal dimension. But these languages lack features that otherwise are required by certain applications.

In this thesis we propose a CESL, called CEDAR that extends the powers of existing languages. CEDAR allows a user to specify various event management functionalities in the NM domain, which otherwise is difficult or impossible to specify in existing languages. An implementation model of the language operators using Colored Petri Nets is also proposed. We also propose a model of a network management database system that incor- porates CEDAR with an active database system, and various other features required by NM functionalities, such as, event defini- tion, sampling or polling, etc. The resulting system is in- tegrated with the Hy+ database visualization system. The deductive capability provided by the Hy+ system further enhances the power of data and events management, and information presen- tation capabilities.
Report Table of Content Adobe PDF Compressed PostScript
CS-96-38
Title Distributed Scientific Data Processing Using the DBC
Authors J. Duff, K. Salem and M. Livny
Abstract The Distributed Batch Controller (DBC) supports scientific batch data processing. The DBC distributes batch jobs to one or more pools of workstations and monitors and controls their execution. The pools themselves may be geographically distributed, and need not be dedicated to processing batch jobs. We describe the use of the DBC in a large scientific data processing application, namely the generation of atmospheric temperature and humidity profiles from satellite data. This application shows that the DBC can be an effective platform for distributed batch processing. It also highlights several design and implementation issues that arise in distributed data processing systems.
Date November 1996
Report Adobe PDF Compressed PostScript
CS-96-36
Title Application of a Stochastic Grammatical Inference Method to Text Structure
Authors M. Young-Lai
Abstract For a document collection where structural elements are identified with markup, it is sometimes necessary to retrospectively construct a grammar that constrains element nesting and ordering. An approach to this problem is described based on a grammatical inference method that generates stochastic finite automata. Improvements are made to the algorithm based on experiments with data from the Oxford English Dictionary. The problems of understanding results and interactively adjusting the algorithm's parameters are also considered.
Date October 1996
Comments This technical report is a reprint of my Master's thesis.
Report Adobe PDF Compressed PostScript
CS-96-35
Title Designing Subdivision Surfaces through Control Mesh Refinement
Authors H. Sheikh
Abstract I present a technique for designing subdivision surfaces through the repeated process of refining the control mesh of the surface. The refinement is performed using the surfaces subdivision algorithm. An abstract structure for subdivision surfaces is developed that involves the control mesh and the subdivision algorithm. This structure is used to build a generic editor that assists in the design of surfaces represented by classes of control meshes and subdivision algorithms.

The theory of subdivision surfaces, in particular, tensor product cardinal splines and box splines is described in detail. From the theory, subdivision algorithms for these surfaces are constructed and then used to create Refiners. The concept of trimming these surfaces is also introduced to aid in rendering of the surface. Several C++ spline classes used in the implementation are described. The subdivision surface editor uses these classes and Open Inventor to provide a prototype 3D surface design, manipulation and editing environment.
Date September 1996
Comments My thesis supervisor was Richard Bartelswho is probably the best person to get in touch with regarding current progress with the thesis.
Report Adobe PDF Compressed PostScript
CS-96-34
Title Using Inverted Lists to Match Structured Text
Authors S. Li
Abstract Inverted files have long been used as an effective way to implement secondary key retrieval in traditional database systems. For information retrieval systems, postings lists use a similar inverted file structure to accommodate the special characteristics of text data.

This thesis explores extensions to these ideas for structured text databases, especially concerning direct containment of structures. Two kinds of data structures are presented, and for each structure, six basic operations are described and analyzed. Two complex forms of query expressions concerning hierarchical and lateral relationships between text structures are also examined. Examples are presented to demonstrate the effect on performance of each data structure.
Date September 1996
Comments This report is a reprint of a thesis that embodies the results of research done in partial fulfillment of the requirements for the degree of Master of Mathematics in Computer Science at the University of Waterloo.
Report Adobe PDF Compressed PostScript
CS-96-33
Title A Method of Program understanding Using Constraint Satisfaction for Software Reverse-Engineering
Authors S. Woods
Abstract The process of understanding a source code in a high-level programming language is a complex cognitive task. The provision of helpful decision aid subsystems would be of great benefit to software maintainers. Given a library of program plan templates, generating a partial understanding of a piece of software source code can be shown to correspond to the construction of mappings between segments of the source code and particular program plans represented in a library of domain source programs (plans). These mappings can be used as part of the larger task of reverse engineering source code, to facilitate many software engineering tasks such as software reuse, and for program maintenance.

We present a novel model of program understanding using constraint satisfaction. The model composes a partial global picture of source program code by transforming knowledge about the problem domain and the program structure into constraints. These constraints facilitate the efficient construction of mappings between code and library knowledge. Under this representational framework, earlier heuristic approaches to program understanding may be unified, contrasted, and compared.

We describe an empirical study which demonstrates the feasibility of our model in the program understanding subtask of partial local explanation. In addition, we incorporate this local model into the larger task of combining these local explanations into a coherent global picture of the code. While many heuristic global models are possible, we describe an encompassing structure and demonstrate, through a careful depiction of the algorithm and several domain examples, how constraint satisfaction offers a rich paradigm for the larger task of reverse engineering source code, to exploiting both library and program structural constraint information. One primary advantage of the constraint satisfaction paradigm (CSP) is its generality; many previous program understanding efforts can be more easily compared. Another advantage is the improvement in search efficiency using various heuristic techniques in CSP.
Date September 1996
Report Adobe PDF Compressed PostScript
CS-96-32
Title World Space User Interface for Surface Pasting
Authors L.K.Y. Chan
Abstract Surface pasting is a composition method that applies B-spline surfaces, called features, to any number of other B-spline surfaces, called base surfaces, in order to add details on the base surfaces. The locations as well as the size of the features are determined by transformations of feature domains. By modifying the domain layout of pasted surfaces, we can manipulate the appearances of features interactively in a Domain Space User Interface. However, this domain user interface is inadequate because the user cannot interact with the three-dimensional model directly. In this thesis, I propose a World Space User Interface that attempts to map three-dimensional user actions to two-dimensional domain operations and aims at providing a more intuitive and efficient modeling interface for surface pasting.
Date September 1996
Report Adobe PDF Compressed PostScript
CS-96-31
Title Query Based Stemming
Authors E. Tudhope
Abstract In information retrieval the relevancy of a document to a particular query is based on a comparison of the terms appearing in the query with the terms appearing in the document. Morphological variants of words (i.e. locate, locates, located, locating) often carry the same or similar meaning. Such terms should be considered equivalent for information retrieval purposes. Stemming is a simple application of natural language processing that is commonly applied at index time to reduce morphological variants to a common root form. This report first examines several approaches to stemming. Some of the problems associated with the use of stemming as a query expansion technique are then discussed. The construction of equivalence classes of words for stemming at query time is presented as a possible alternative method to address some of these problems.
Date September 1996
Report Appendix Adobe PDF Compressed PostScript
CS-96-30
Title Finding the Loneliest Point
Authors K.Y.Yeung
Abstract We study the following problem which finds application in solving equations. Given a Euclidean domain, there are two operations, INSERT and ISOLATE. INSERT adds points to the domain. ISOLATE returns a point in the domain which is satisfactorily far from the points already inserted in the domain.

The objective is to perform INSERT and ISOLATE in constant amortized time per operation, linear space and a constant worst case ratio. Worst case ratio measures how far the answer from ISOLATE differs from the best possible answer. We concentrate on the case with a two dimensional domain.

The basic approach is to divide the domain into geometric shapes and build a hierarchy of that shape. We have studied the three regular polygons that tile a given two dimensional domain: squares, hexagons and triangles. It turns out that the worst case ratio is smaller when the domain is divided into hexagons than when the domain is divided into squares, the worst case ratio for squares is in turn smaller than that of the triangles.

We have also explored other techniques to improve the worst case ratio while keeping constant amortized running time per operation and linear space. We have explored overlapping, embedding (or diamonds), and multiple grids. The worst case ratio can be improved further when these techniques are combined.

We have also shown that the square technique can be extended to cubes in a three dimensional domain.
Date August 1996
Report Adobe PDF Compressed PostScript
CS-96-29
Title Algorithms from Blossoms
Authors S. Mann
Abstract Blossoming is a theoretical technique that been used to develop new CAGD theory. However, once we have developed this theory, we need to devise algorithms to implement it. In this paper, I will discuss converting blossoming equations into code, noting techniques that can be used to develop efficient algorithms. These techniques are illustrated by considering the operations of basis conversion and polynomial composition.
Date October 1996
Report Adobe PDF Compressed PostScript
CS-96-28
Title Robust Numberical Methods for PDE Models of Asian Options
Authors R. Zvan, P.A. Forsyth and K. Vetzal
Abstract We explore the pricing of Asian options by numerically solving the associated partial differential equations. We demonstrate that numerical PDE techniques commonly used in finance for standard options are inaccurate in the case of Asian options and illustrate modifications which alleviate this problem. In particular, the usual methods generally produce solutions containing spurious oscillations. We adapt flux limiting techniques originally developed in the field of computational fluid dynamics in order to rapidly obtain accurate solutions. We show that flux limiting methods are total variation diminishing (and hence fre eof spurious oscillations) for non-conservative PDEs such as those typically encountered in finance, for fully explicit, and fully and partically implicit schemes. We also modify the van Leer flux limiter so tht the second-order variation diminishing property is preserved for non-uniform grid spacing.
Date September 1996
Report Adobe PDF Compressed PostScript
CS-96-27
Title Aposteriori Error Estimation for CFD
Authors B. van Straalen
Abstract A technique for numerically estimating the discretization error in upwind based finite volume fluid flow simulation was developed. The technique is based on residual estimation, followed by solving the global error equation over the computational domain. One and two dimensional analysis of the error estimation process was performed for a simple homogeneous advection-diffusion equation. The technique was then extended to encompass the laminar Navier-Stokes equations. The effectiveness of the technique was investigated for flow over a two dimensional backward-facing step. The results show promise for future implementations of effective and practical error estimation.
Date July 1996
Report Adobe PDF Compressed PostScript
CS-96-25
Title On Line Target Searching in Bounded and Unbounded Domains
Authors A. Lopez-Ortiz
Abstract In this work we study the problem of a robot searching for a target on bounded and unbounded domains, specifically in one and two dimensions. We present some relevant results in the field, and within this framework, the contributions of this work. We show that one-sided searches on the real line have an average competitive ratio of 9 and we apply this result for the case of known destination searches on G-street polygons. For this class of polygons we also show that the best known upper bound of sqrt(82) is indeed optimal for the case of unknown destination searches. For orthogonal street polygons we prove that knowing the location of the destination does not reduce the competitive ratio.

We present the first robust algorithm for searches inside street polygons under navigational errors, as well as for other important classes of impaired robots such as robots lacking triangulation and depth measurement mechanisms. In particular, we propose a \pi+1-competitive strategy which is robust under navigational error, and equally efficient for oblivious robots. We also present an 1.92-competitive strategy which does not require depth perception from the robot.

We also give the best-known strategy for searching inside street polygons, at a competitive ratio of 1.76. This significantly improves over the best previously known ratio of 2.8.

We give the first non-trivial lower bound for searching and walking into the kernel of a polygon. We provide upper and lower bounds for searching for a target inside street polygons. These results are the first constant competitive ratio strategies inside a class of polygons which is independent of the location of the start and target points.

We also prove negative results regarding the optimality of several well-known family of strategies for searching inside simple polygons. In particular we show that Kleinberg's strategy is 2.6-competitive, and that continuous bisector, which is in many respects a good strategy, is at least 1.68-competitive. Lastly we show that any strategy which does not respond to the absence of new extreme points has a competitive ratio strictly larger than sqrt(2).
Date July 1996
Report Adobe PDF Compressed PostScript
CS-96-24
Title Double-Blind Scores of an Object-Oriented Modeling Survey
Authors R.S.C. Yiu
Abstract
Date June 1996
Report Adobe PDF Compressed PostScript
CS-96-23
Title Optimal adaptive polygonal approximation of parametric surfaces
Authors L. Velho and L.H. de Figueiredo
Date July 1996
Report Only available in printed format
CS-96-22
Title Spline-Based Tools for Conventional and Reflective Image Reproduction
Authors I.E. Bell
Date May 1996
Report Only available in printed format
CS-96-21
Title Length of Finite Pierce Series:
Theoretical Analysis and Numerical Computations
Authors V. Keselj
Abstract
Any real number  x  from the interval  (0,1]  can be represented as
  a unique Pierce series

     1      1        1
    -- - --- + ---- - ...
     q1   q1*q2   q1*q2*q3

  where  {q1, q2, q3, ...}  is an increasing sequence of positive
  integers. The series is finite if and only if the number  x  is
  rational. This paper discusses the length of the series and the
  final results are a new upper bound (Theorem 2) and a new lower
  bound (Theorem 3) on the length.

  The numerical computations concerning the length are done by
  computer and the algorithms used and results are presented. The
  numerical results are an extension to the tables previously
  published.
Date September 1996
Report Adobe PDF Compressed PostScript
CS-96-20
Title The Empirical Derivation of a Design Space and Design Rules for Software Architecture
Authors K. Reddy
Abstract The empirical derivation of a design space and design rules for software architecture is presented. The research presented in this paper represents and effort to begin capturing what is known by experienced designers about designing for non-functional qualities, such as performance, portability, reusability etc., in a rigorous, statistically validated fashion. The design space and design rules are codified by synthesizing information collected by the administration of a survey to experienced designers of large scale systems. They capture the relationships between six unit operations which are structure transforming mechanisms ubiquitous in software design, and eleven non-functional qualities. Their use enables design for non-functional qualities to occur at an earlier development stage than current practice. The creation, administration and analysis of the survey is presented along with the resultant design space and design rules.
Date May 1996
Report Only available in printed format
CS-96-18
Title Data Transfer Using Controlled Compression
Authors A.Y.D. Cheung
Abstract In a scenario in which a large amount of data is to be transferred from one site to another, it is desirable to optimize the throughput of data transfer. There are two transmission options. One option is to compress the data before the transfer, and to decompress them at the receiving site. The other option is to send the data without doing any compression. Ideally, we would like to choose the option that accomplishes the transfer as quickly as possible. This choice is complicated by many factors such as the load of the machines, and the degree of traffic congestion of the network. These factors affect the transmission rate differently depending on whether or not we use compression. This thesis proposes and evaluates two algorithms for automatically determining whether compression should be used. One algorithm is based on sampling past performance with and without compression. The other directly exploits feedback from the network. These algorithms have been implemented and evaluated under different conditions. The evaluation shows that both algorithms can pick the correct mode of data transfer to use in different environments. The algorithm based on network feedback is more complicated and requires tuning to perform well.
Date April 1996
Report Adobe PDF Compressed PostScript
CS-96-17
Title The DBC: Processing Scientific Data Over the Internet
Authors C. Chen, K. Salem and M. Livny
Abstract We present the Distributed Batch Controller (DBC), a system built to support batch processing of large scientific datasets. The DBC implements a federation of autonomous workstation pools, which may be widely-distributed. Individual batch jobs are executed using idle workstations in these pools. Input data are staged to the pool before processing begins. We describe the architecture and implementation of the DBC, and present the results of experiments in which it is used to perform image compression.
Date March 1996
Report Adobe PDF Compressed PostScript
CS-96-15
Title Overview of GSM: The Global System for Mobile Communications
Authors J. Scourias
Date March 1996
Report Adobe PDF Compressed PostScript
CS-96-14
Title A Normal Form for Function Rings of Piecewise Functions
Authors M.von Mohrenschildt
Abstract Computer algebra systems often have to deal with piecewise defined functions, for example, the absolute value function. We present a new approach to this kind of problem. This paper provides a normal form for function rings containing piecewise functions. We give a compiled rule system to compute the normal form of a function. With a normal form, we can decide equality in our function ring. In this ring, we can define supremum and infimum of two functions. In fact we show that the function ring is a compiled lattice. We present a method to solve equalities and inequalities in this function ring.
Date March 1996
Comments This paper has been submitted to ISSAC'96.
Report Adobe PDF Compressed PostScript
CS-96-13
Title Some improvements of a lemma of Rosenfeld
Authors F. Boulier
Abstract We give two improvements of a lemma of Rosenfeld which permit us to optimize some algorithms in differential algebra: we prove the lemma with stronger hypotheses and we demonstrate an analogue of Buchberger's second criterion, which avoids non necessary reductions for detecting coherent sets of differential polynomials. We try also to clarify the relations between the theorems in differential algebra and some more widely known results in the Groebner bases theory.

Keywords:
Differential algebra. Rewrite systems. Polynomial differential equations. Rosenfeld's lemma.
Date April 1996
Comments Submitted to IMACS'96
Report Adobe PDF
CS-96-12
Title A Multicollaborative Push-Caching HTTP Protocol for the WWW
Authors Alex Lopez-Ortiz
Abstract We propose a caching protocol designed to automatically mirror heavily accessed WWW pages in a distributed and temporal fashion. The proposed caching mechanism differs from proxy type mechanisms in that it caches according to load pattern at the server side, instead of access patterns at the client-side LAN, in a Demand-based Document Dissemination (DDD) system fashion. This type of server initiated caching scheme has been termed push-caching. As well, the proposed caching scheme incorporates topological caching functions. The proposed protocol is orthogonal to other extensions to the HTTP protocol and other caching schemes already in use.
Date October 1996
Report Adobe PDF Compressed PostScript
CS-96-09
Title Tabular Abstraction, Editing and Formatting
Authors X. Yang
Abstract This dissertation investigates the composition of high-quality tables with the use of electronic tools. A generic model is designed to support the different stages of tabular composition, including the editing of logical structure, the specification of layout structure, and the formatting of concrete tables. The model separates table's logical structure from its layout structure, which consists of tabular topology and typographic style. The notion of an abstract table, which describes the logical relationships among tabular items, is formally defined and a set of logical operations is proposed to manipulate tables based on these logical relationships. An abstract table can be visualized through a layout structure specified by a set of topological rules, which determine the relative placement of t abular items in two dimensions, and a set of style rules, which determine the final appearance of different items. The absolute placement of a concrete table can be automatically generated by applying a layout specification to an abstract table. An NP-complete problem arises in the formatting process that uses automatic line breaking and determines the physical dimension of a table to satisfy user-specified size constraints. An algorithm has been designed to solve the formatting problem in polynomial time for typical tables. Based on the tabular model, a prototype tabular composition system has been implemented in a UNIX, X Windows environment. This prototype provides an interactive interface to edit the logical structure, the topology and the styles of tables. It allows us to manipulate tables based on the logical relationships of tabular items, regardless of where the items are placed in the layout structure, and is capable of presenting a table in different topologies and styles so that we can select a high-quality layout structure.
Date January 1996
Report Adobe PDF Compressed PostScript
CS-96-07
Title The Use of a Combined Text/Relational Database System to Support Document Management
Authors K.Y. Ng
Abstract

In this thesis, we study the problem of representing and manipulating a document to facilitate browsing, editing, string/content searches and document assembly.

Two major data models in which documents are represented and stored are:

  1. a relational data model, where all text contents in a document are represented in relations, each with several attributes, or
  2. a text data model, where documents are represented as contiguous characters, typically interspersed with tags to capture their various logical, semantic, and presentational features and relationships.

    Each approach has its own strengths and limitations. In our work, we study how a hybrid system based on a combined text/relational model can support document management. We describe database design trade-offs involving the appropriate placement of information in the text and relational database components. With an appropriate design, the advantages of both models can be exploited, while the shortcomings of using them individually are diminished.

    We propose a set of primitive operations and a methodolgy for using them to evaluate the various alternatives for data placement. The methodology consists of simulating pre-defined, representative, document management tasks using the primitive operations and studying the numbers, types, and the time performance of the operations involved. Using some representative document management tasks as examples, we demonstrate the use of the methodology and the primitive operations to study and compare the processing of the tasks in the various data models mentioned above.
Date January 1996
Report Adobe PDF Compressed PostScript
CS-96-04
Title Asymptotically Fast Computation of Hermite Normal Forms of Integer Matrices
Authors A. Storjohann and G. Labahn
Abstract This paper presents a new algorithm for computing the Hermite normal form H of an n x m rank m integral matrix A together with a unimodular pre-multiplier matrix U such that UA=H . Our algorithm requires O~(m^(\theta-1) n M(m log ||A||)) bit operations to produce both H and a candidate for U . Here, ||A|| =\max_{ij} |A_{ij}| , M(t) bit operations are sufficient to multiply two t-bit integers, and \theta is the exponent for matrix multiplication over rings: two m x m matrices over a ring R can be multiplied in O(m^\theta) ring operations from R . The previously fastest algorithm of Hafner & McCurley requires O~(m^2 n M(m log ||A||)) bit operations to produce H , but does not produce a candidate for U . Previous methods require on the order of O~(n^3 M(m log ||A||)) bit operations to produce a candidate for U -- our algorithm improves on this significantly in both a theoretical and practical sense.
Date January 1996
Comments This paper has been submitted to ISSAC'96.
Report Adobe PDF Compressed PostScript
CS-96-03
Title Near Optimal Algorithms for Computing Smith Normal Forms of Integer Matrices
Authors A. Storjohann
Abstract We present new algorithms for computing Smith normal forms of matrices over the integers and over the integers modulo N . For the case of matrices over Z/(N) , we present an algorithm that computes the Smith form S of an n x m matrix A over Z/(N) in only O(n^\theta m) operations from Z/(N) . Here, \theta is the exponent for matrix multiplication over rings: two n x n matrices over a ring R can be multiplied in O(n^\theta) operations from R . We apply our algorithm for matrices over Z/(N) to get an algorithm for computing the Smith form S of an n x m matrix A over Z in O~(n^(\theta-1) m M(n log ||A||)) bit operations (where ||A|| = max |A_{i,j}| and M(t) bounds the cost of multiplying two t-bit integers). These complexity results improve significantly on the complexity of previously best known Smith form algorithms (both deterministic and probabilistic) which guarantee correctness.
Date January 1996
Comments This paper has been submitted to ISSAC'96.
Report Adobe PDF Compressed PostScript
<1995 1997>