From Data Independence to Ontology Based Data Access (and back)
Among the most commonly cited features of the ontology-based data access (OBDA) approach to accessing data sources is its ability to offer a high-level, user-friendly interface to a conceptual understanding of the data (i.e., an ontology), while still utilizing low-level but efficient ways of representing the data in a computer store. The aim of this tutorial is to compare and contrast this OBDA-based approach with approaches centered around the concept of data independence, which has been under development in the area of database systems since the early 1970s. The tutorial focuses on the common lessons shared by all approaches, and on how each can benefit from lessons learned from the other.
Location and Time: TBA, IJCAI 2020, Yokohama, Japan.
Accessing information using a high-level data model or ontology has been a long-standing objective of research communities in several areas. In work based on knowledge representation in artificial intelligence (AI), this objective commonly falls under the heading of OBDA and of ontology mediated querying (OMQ), and has fostered the development of approaches using query rewriting or using variants of the so-called combined approach. However, the underlying idea of separating an ontological view of how information must be understood by users from a physical view of the layout of data in data structures, called data independence, has been the focus of work in the area of information systems for more than fifty years. This tutorial explores how the original idea of data independence evolved and ultimately culminated in logic-based approaches to information management that have enabled high-level ontological views of information entirely devoid of any low-level physical views of concrete data layout. An integral part of the tutorial is to explore the relationship between the high-level ontologies that users see and the understanding of the physical representation of such information in computer systems that is necessary to attain acceptable performance. The tutorial will address the latter by showing how ontologies derived by ontology design in AI can be used to obtain an understanding of the physical encoding of information that is sufficiently fine-grained to ensure that the code ultimately executed to satisfy users' information requests is competitive in performance with solutions hand-written in low-level programming languages such as C.
- Ontologies, Logical Theories, and Data Independence. We start by introducing the idea of data
independence itself and show how logic-based AI technologies can be used to formally capture this
idea. We also survey the key developments and barriers to full adoption of the idea in information systems;
- Physical Design as Logical Design. We continue with representative examples of what can be achieved by
full adoption of this idea. We focus on the link between the conceptual/logical understanding of the
information and its physical representation in a computer system (called physical design). We also
show how knowledge representation can be used to account for various intricacies of a physical design;
- Supporting Technology. We discuss AI technologies needed to make the idea of data independence viable
in practice, focusing on issues relating to generation of efficient code that can be subsequently integrated
in applications and information systems;
- Open Problems. We conclude the tutorial with an outline of directions for further research, and with
a list of open issues related to physical data independence in ontology-based information systems.
Audience and Background
The topics covered in the tutorial are of interest to a wide range of AI researchers and to members of the general public with an interest in knowledge representation. In particular, the tutorial targets the following groups:
- Undergraduate and graduate students and junior researchers: the tutorial introduces this group to state-of-the-art approaches to addressing issues connected with representation, storage, and manipulation of information and to modern techniques that address these issues;
- Researchers in the area of knowledge representation and other areas of AI: the tutorial provides bridges to many areas of AI where large data sets are used, ranging from approaches to knowledge representation and, in particular, implementation of such systems, to managing information for Semantic Web systems;
- Industry practitioners and developers: the tutorial provides ideas on how the development of software systems, in particular in the critical phase of conceptual modelling and its mapping to physical computer storage, can be improved and what tools are available to aid this goal;
- Members of the general public with an interest in the logical underpinnings of information management and in technologies based on these ideas.
Relevance to IJCAI 2020
The tutorial focuses on foundational issues relating to representation of information in computer systems, including knowledge bases, ontologies, and information systems based on ontologies, and on how issues relating both to user appreciation of the information and to efficient use of the underlying computing infrastructure can be comprehensively addressed. Since every design of an information system faces decisions relating to how external entities will be represented within such a system (in addition to representing various properties of such entities), a general approach to this problem is of interest to ontology developers/engineers and data scientists. Interestingly, the approach to the representation and storage of information discussed in the tutorial naturally and seamlessly complements standard approaches in conceptual and ontology design methodologies. The tutorial is thus of interest both to researchers in knowledge representation and to practitioners in the wide area of information management.
About the Authors
Dr. David Toman and Dr. Grant Weddell are professors of Computer Science at the University of Waterloo, Canada. They have published and presented results in the area of knowledge representation over the last 20 years at premier AI conferences (receiving, among others, the Ray Reiter Prize at KR 2010 and the Best Paper Prize at ISWC 2013). Dr. Toman has also given tutorials in the area of temporal representation and reasoning and temporal databases and information systems, which have led to an invited chapter in the Handbook of Temporal Reasoning in Artificial Intelligence.
Presenters' Background in the Area of the Tutorial
The authors have a longstanding interest in the area of the tutorial and have published a monograph on this topic, Fundamentals of Physical Design and Query Compilation. They are also experts on OBDA/OMQ approaches to query answering in knowledge representation systems and were awarded (with coauthors) the Ray Reiter Prize at KR 2010 for their work on the combined approach to OBDA, and later the Best Paper Prize at ISWC 2013. They are authors of many other papers on this topic and have been developing an experimental system that validates the general approach to data independence discussed in the tutorial.
The authors have recently presented tutorials on the topic of referring expressions in knowledge representation and information systems, based on results developed together with Alexander Borgida (Rutgers) for which they were awarded the Ray Reiter Best Paper prize at KR 2016. Subsequently, with their coauthors, they were awarded the 2018 Bob Wielinga Best Paper Award for the paper furthering the use of referring expressions in conceptual modelling. The tutorials were as follows:
- Referring Expressions in Ontologies and Query Answering at the 10th International Conference on Formal Ontology in Information Systems, FOIS 2018 (in Cape Town, South Africa, September 2018),
- Managing and Communicating Object Identities in Knowledge Representation and Information Systems at the 31st Australasian Joint Conference on Artificial Intelligence, AI 2018 (in Wellington, New Zealand, December 2018), and
- Referring Expressions in Knowledge Representation Systems at the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019 (in Macao, SAR China, August 2019).
Name: David Toman and Grant Weddell
Affiliation: Cheriton School of Computer Science, University of Waterloo
Address: 200 University Ave W., Waterloo, ON N2L 3G1, Canada
Resources and Bibliography
- Tutorial Slides: