Frank Tompa

Research Interests

Professor Tompa's teaching and research interests are in the fields of data structures and databases, and he has been an active member of the Database Research Group and the Information Retrieval Group since their founding. His current interests are in text-dominated database applications, where the content consists primarily of long, semi-structured strings, which traditional database systems support less well than they support numbers. The challenge is to discover how the complexity of text, with its intricate structure and diversity of expression, can be efficiently managed.

Professor Tompa's approach to text database research starts from the needs of real applications. He is particularly interested in problems that can be addressed through (1) improvements to data models and languages for structured text, (2) design and implementation of text transformation systems, and (3) design and implementation of improved algorithms and storage structures for document searching and browsing. With his graduate students and colleagues, he pursues research on the following questions, with the aim of providing efficient data access, maintenance, and delivery: How can an application's information needs be specified, whether as a single query or as an iterative refinement that may include browsing? How can the specification of needs be converted into searches over the structure and content of heterogeneous data? What storage and indexing techniques best balance applications' requirements for query, update, and mining? How much data needs to be transmitted back to the application in response to a query? How much data transformation (from its stored form to a form suitable for the application) should be done at the site(s) of the repository and how much at the application site(s)? Under what conditions should partially transformed data be cached (and perhaps updated) in the repository in order to support better access for data retrieval or data mining?

Degrees and Awards

ScB, ScM (Brown), PhD (Toronto), LLD (Dalhousie)

Distinguished Professor Emeritus, University of Waterloo (2014); Queen Elizabeth II Diamond Jubilee Medal, Government of Canada (2012); ACM Fellow (2010); David R. Cheriton Faculty Fellow, University of Waterloo (2007-2010); Street named Frank Tompa Drive, University of Waterloo/City of Waterloo (2005); Award of Excellence in Graduate Supervision, University of Waterloo (2005); Faculty of Mathematics Fellow, University of Waterloo (2000-2003); University-Industry Synergy Award, with Open Text et al. (1997); Named NSERC Leader in Canadian Science (1991).

Industrial and Sabbatical Experience

As part of a joint venture with the Oxford University Press, Professor Tompa was a co-leader of the University of Waterloo's project to design an on-line dictionary database that is suitable for editors charged with maintaining the Oxford English Dictionary (OED), lexicographers working on other dictionaries, and researchers who wish to consult the OED interactively. The software developed during this project not only addressed the needs related to the OED, but it also proved to be of sufficient benefit in other commercial applications that it could form the basis of Open Text Corporation, a spin-off company of which Professor Tompa was a co-founder and initial member of the Board of Directors. In addition, he was named a “leader in Canadian Science” by the Natural Sciences and Engineering Research Council of Canada (NSERC) in their first publication of twelve Great Canadian Success Stories in 1991.

From 1993 through 1997, Professor Tompa served as a principal investigator on an industrially motivated project to extend SQL to accommodate structured text. This research was conducted in close collaboration with Open Text Corporation and other members of the Canadian Strategic Software Consortium (CSSC), supported in part by Industry Canada and including Fulcrum Technologies, Grafnetix Systems, InContext Systems, Megalith Technologies, Public Service Systems, and SoftQuad. The collaboration received a 1997 University-Industry Synergy R&D Partnership Award from the Conference Board of Canada and NSERC.

Other industrial research collaborations have included Bellcore (where Professor Tompa was a member of technical staff during his sabbatical leave in 1987-88), IBM Toronto Lab, Bell University Laboratories, and Microsoft Research (where he was a visiting researcher during his sabbatical leave in 2007).

Representative Publications

A. R. Kane and F. W. Tompa. Skewed Partial Bitvectors for List Intersection. Proc. of ACM SIGIR: 263-272, 2014.

S. Kamali and F. W. Tompa. Retrieving Documents with Mathematical Content. Proc. of ACM SIGIR: 353-362, 2013.

A. A. Ataullah and F. W. Tompa. Business Policy Modeling and Enforcement in Databases. Proc. of Very Large Data Bases (PVLDB), 4(11): 921-931, 2011.

D. E. DeHaan and F. W. Tompa. Optimal Top-Down Join Enumeration. Proc. of ACM SIGMOD: 785-796, 2007.

R. H. Warren and F. W. Tompa. Multi-column Substring Matching for Database Schema Translation. Proc. of Very Large Data Bases (VLDB): 331-342, 2006.

H. Zhang and F. W. Tompa. XQuery Rewriting at the Algebraic Level. In “Trends in XML Technology for the Global Information Infrastructure,” a special issue of Journal of Computer Systems, Science, and Engineering, 18(5):241-262, 2003.

M. Young-Lai and F. W. Tompa. Stochastic Grammatical Inference of Text Database Structure. Machine Learning, 40:111-137, 2000.

Contact Information


Frank Tompa