For a list of students supervised, refer to Students.
See also Publications.

Overview of research program

The long-range objective of my research has been (and will continue to be) to develop a unified methodology for designing data structures from the individual users' models through the enterprise model to the storage structures. This has involved the development of formal models, the development and analysis of effective algorithms, and the application of the ideas to solving large, practical problems.

We pursued this objective first as it applies to conventional record-oriented databases. Doctoral students working with me concentrated on defining and analyzing properties of "normal forms" as part of the design of a conceptual model [Osborn PhD:77, Ling PhD:78]. Collaborating in part with Professor Gaston Gonnet, we examined formalisms for describing data structures to support efficient algorithms [Tompa 1977, Gonnet-Tompa 1983], concentrating particularly on the specification of their abstract structures [Santoro PhD:79, Tompa 1980] and on the design of efficient storage structures and policies [Ramirez PhD:80, Ziviani-Tompa 1982]. More recently, students working with me have concentrated on supporting user models of the data: examining the problems of processing database updates that are expressed in terms of a partial view of the data [Medeiros PhD:85, Brodnik-Tompa 1993] and of keeping a materialized view up-to-date in the presence of change to the underlying stored data [Blakeley PhD:87]. We have also examined algorithms to process users' queries by the most efficient means available [Icaza PhD:87].

Since 1981, we have examined database concerns for non-standard databases, first concentrating on videotex databases. Because the fundamental assumptions about the nature of the data and its uses distinguish videotex databases from conventional ones, we developed a page-oriented database model that includes query and update facilities [Tanguay MMath:86]. Because of videotex's orientation towards large public-access systems for casual users, several students working with me considered the support of individualized views. We investigated powerful browsing facilities [Raymond MMath:84], graphical query languages [Böggild MMath:86], and the effectiveness of users' classification systems [Raymond-Cañas-Tompa-Safayeni 1989]. Although declining interest in videotex reduced the impact of our work, it served well as a precursor for ongoing work with hypertexts [Tompa 1990, Tompa-Raymond 1991, Tompa-Blake-Raymond 1993].

Since 1984, we have concentrated on more general text-dominated databases. The thrust of this research has been directed towards the development of a database system that will be capable of supporting the needs of text creators (such as the editors at the Oxford University Press who will maintain and enhance the Oxford English Dictionary), while simultaneously supporting the needs of text users (writers, editors, humanist scholars) who will access such machine-readable texts interactively. Again the conventional assumptions about data were found to be inappropriate (even the fundamental concept of "entity" does not apply) and, in close collaboration with Professor Gaston Gonnet, we developed two new models of text data [Blake-Bray-Tompa 1992, Salminen-Tompa 1994]. Because of the great potential, an Ontario company, Open Text Corporation, began operations in July 1989 to develop and market products based on our research. Open Text, which currently employs over 12,000 individuals worldwide, has developed the Livelink Intranet application suite including Livelink Search, which has evolved from our research.

Later research includes the design and development of a text/relational database management system and of a system for data resource discovery for deployment on the internet. The text/relational system is based on a federated model with a hybrid query processor that supports extensions to SQL to accommodate structured text, such as text described using SGML.

The following list of major collaborations is indicative of the value of the research to industry:

Business Intelligence Network, Professor Renée Miller and 14 other Canadian professors, NSERC Strategic Network, April 2009 - March 2014, Industrial partners: IBM Canada and SAP.
Records Identification and Retention in Relational Database Management Systems, Industrial grant, January 2011 - June 2011, Industrial partner: RIM.
Rolling Back Databases, Industrial grant, September 2005 - August 2009, Industrial partner: Open Text Corporation.
Access Control for XML and Object-Oriented Databases, with Professor Kenneth Salem, CITO grant, October 2003 - September 2005, Industrial partner: Open Text Corporation.
Managing Large, Diverse Data Resources, with ten other UW professors, Bell Canada Universities Labs grant, May 2000 - April 2002, Industrial partner: Bell Canada.
Resource Discovery in Web Databases, with Dr. Mariano Consens, ITRC grant, January 1996 - December 1997.
Investigation of Text/Relational DBMS, with Professor Paul Larson, Strategic Technologies grant, April 1993 - March 1997, Industrial partner: Open Text Corporation (as part of the Canadian Strategic Software Consortium (CSSC), supported in part by Industry Canada and including also Fulcrum Technologies, InContext Systems, Megalith Technologies, Public Service Systems, SoftQuad, and Systèmes Grafnetix).
Flexible Text Visualization with Applications to CASE, NSERC Industrially Oriented Research grant, August 1991 - July 1994, Industrial partner: IBM Canada Laboratory.
Text-Dominated Databases, with Professors Gaston Gonnet, Ian Munro, and Sylvia Osborn, NSERC Co-operative Research and Development grant, November 1986 - October 1989, Industrial partner: Oxford University Press.
Data Structuring for Page-oriented Database Systems, with Professors Gaston Gonnet, Paul Larson, and Ian Munro, NSERC Strategic grant, November 1983 - October 1986.