DISIMA Project Refereed Publications

V. Oria, M. T. Özsu, and P.J. Iglinski, "Foundation of the DISIMA Image Query Languages", Multimedia Tools & Applications Journal, 23: 185-201, 2004.

Abstract: Because digital images are not meaningful by themselves, images are often coupled with some descriptive or qualitative data in an image database. These data, divided into syntactic (color, shape, and texture) and semantic (meaningful real-world object or concept) features, necessitate novel querying techniques. Most image systems and prototypes have focussed on similarity searches based upon the syntactic features. In the DISIMA system, we proposed an object-oriented image data model that introduces two main types: image and salient object. We further defined operations on the images and the salient objects as new joins. This approach is necessary in order to envision a declarative query language for images. This paper summarizes the querying facilities implemented for the DISIMA system and gives their theoretical foundation: the data model and the complementary algebraic operations, the textual query language (MOQL) and its visual counterpart (VisualMOQL) based on an image calculus. Both languages are declarative and allow the combination of semantic and similarity queries.

V. Oria and M. T. Özsu, "Views or Points of View on Images," International Journal of Image and Graphics, 3(1): 55-80, 2003.

Abstract: Images, like other multimedia data, need to be described, as it is difficult to grasp their semantics from the raw data. With the emergence of standards like MPEG-7, multimedia data will be more and more produced together with some semantic descriptors. But a description of multimedia data is just an interpretation, a point of view on the data, and different interpretations can exist for the same multimedia data. In this paper we explore the use of view techniques to define and manage different points of view on images. Views have been widely used in relational database management systems to extend modeling capabilities, and to provide logical data independence. Since our image model is defined on an object-oriented model, we first propose a powerful object-oriented mechanism based on the distinction between class and type. The object view is used in the image view definition. The image view mechanism exploits the separation of the physical representation in an image of a real-world object from the real object itself to allow different interpretations of an image region. Finally we discuss the implementation of the image view mechanisms on existing object models.

V. Oria, M. T. Özsu, L. I. Cheng, P.J. Iglinski and Y. Leontiev, "Modeling Shapes in an Image Database System". In Proceedings of the 5th International Workshop on Multimedia Information Systems, Indian Wells, Palm Springs Desert, California, USA, October 1999, pages 34-40.

Abstract: Due to the complex modeling requirement of data handled by image and spatial databases, they are most often built on top of object-oriented or object-relational databases. In the DISIMA image database, an image is composed of salient objects and a salient object has a shape which is a geometric object. The object-oriented modeling of shapes potentially conflicts with the mathematical definitions of geometric objects. Mathematically, a triangle and a rectangle are polygons and a square is a special kind of rectangle. Accordingly, a class Triangle should be a subclass of the class Polygon. In the same way, a class Square should be a subclass of Rectangle which, in turn, should be defined as a subclass of Polygon. But from the point of view of data representation, this leads to a conflict: A polygon minimally requires a list of n consecutive points for its description, whereas a rectangle can be defined by just three points and a square by just two points, if we take advantage of their symmetry. This paper proposes an object-oriented modeling of shapes that accords with their mathematical definitions, optimizes their data representations, and lends power for shape similarity queries.
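The representation conflict described in this abstract can be illustrated with a small sketch. This is not the model proposed in the paper; it is one hypothetical arrangement in which `Polygon` is an interface that only promises a vertex list, while each class stores the fewest points it needs. The class names, the choice of three consecutive corners for a rectangle, and the choice of two opposite corners for a square are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

Point = Tuple[float, float]


class Polygon(ABC):
    """Interface: any polygon can enumerate its vertices in order."""

    @abstractmethod
    def vertices(self) -> List[Point]:
        ...

    def area(self) -> float:
        # Shoelace formula over the (possibly derived) vertex list.
        pts = self.vertices()
        total = 0.0
        for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


class PointListPolygon(Polygon):
    """General polygon: stores all n consecutive points."""

    def __init__(self, points: List[Point]) -> None:
        self._points = list(points)

    def vertices(self) -> List[Point]:
        return list(self._points)


class Rectangle(Polygon):
    """Stores three consecutive corners; the fourth is derived.

    Assumption: the caller passes corners that form a right angle at p1.
    """

    def __init__(self, p0: Point, p1: Point, p2: Point) -> None:
        self._corners = (p0, p1, p2)

    def vertices(self) -> List[Point]:
        p0, p1, p2 = self._corners
        p3 = (p0[0] + p2[0] - p1[0], p0[1] + p2[1] - p1[1])
        return [p0, p1, p2, p3]


class Square(Rectangle):
    """Stores two opposite corners; the other two are derived."""

    def __init__(self, p0: Point, p2: Point) -> None:
        # Deliberately does not call Rectangle.__init__: storage differs.
        self._diagonal = (p0, p2)

    def vertices(self) -> List[Point]:
        (x0, y0), (x2, y2) = self._diagonal
        cx, cy = (x0 + x2) / 2, (y0 + y2) / 2
        dx, dy = x0 - cx, y0 - cy
        # Rotate the half-diagonal by 90 degrees to reach the other corners.
        p1 = (cx - dy, cy + dx)
        p3 = (cx + dy, cy - dx)
        return [(x0, y0), p1, (x2, y2), p3]
```

In this sketch a `Square` is still usable wherever a `Rectangle` or `Polygon` is expected, yet it keeps only two points; operations such as `area` work on the derived vertex list.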

V. Oria, M. T. Özsu, B. Xu, L. I. Cheng and P.J. Iglinski, "VisualMOQL: The DISIMA Visual Query Language". In Proceedings of the 6th IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, June 1999, volume 1, pages 536-542.

Abstract: Multimedia data are now available to a variety of users ranging from naive to sophisticated. To make querying easy, visual query languages have been proposed. Most of these languages have a low expressive power and have their own query processors. Efforts have been made to design query languages with proper semantics to facilitate query optimization and processing in existing database systems. The majority of multimedia database systems are built on top of object or object-relational database systems with the underlying query facilities inherited. The DISIMA system is being built on top of a commercial OODBMS and we have chosen to extend the standard object-oriented query language OQL with some multimedia functionalities. The resulting language is called MOQL. This paper presents VisualMOQL, a visual query language implementing the image component of MOQL.

Y. Niu, M.T. Özsu and X. Li. "2-D-S Tree: An Index Structure for Content-Based Retrieval of Images". In Proc. Multimedia Computing and Networking, San Jose, California, 25-29 January 1999, pages 110-121.

Abstract: An important feature to be considered in the design of multimedia DBMSs is content-based retrieval of images. Most work in this area has focused on feature-based retrieval; we focus on retrieval based on spatial relationships, which include directional and topological relationships. The most common data structure that is used for representing directional relations is the 2-D string. The search process, however, is sequential and the technique does not scale up for large databases. We propose a new indexing structure, the 2-D-S-tree, to organize 2-D strings for query efficiency. The 2-D-S-tree is completely dynamic; inserts and deletes can be intermixed with searches and no periodic reorganization is required. A performance analysis is conducted, and both analytical analysis and experimental results indicate that the 2-D-S-tree is an efficient index structure for content-based retrieval of images.
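To make the scaling problem concrete, the following is a rough sketch of a 2-D string and of the sequential search that the abstract describes as the bottleneck. It is a simplification under stated assumptions: each object is reduced to one integer point, the string uses only the `<` and `=` operators, and `matches` is a plain pairwise order check rather than any published 2-D string matching algorithm. All function names are hypothetical, and the 2-D-S-tree itself is not shown.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]
Picture = Dict[str, Point]  # object symbol -> (x, y) position


def axis_string(objs: Picture, axis: int) -> str:
    """Order symbols along one axis, joined by '<' (before) or '=' (same)."""
    # Assumes objs is non-empty.
    ordered = sorted(objs.items(), key=lambda kv: (kv[1][axis], kv[0]))
    parts = [ordered[0][0]]
    for (_, prev), (name, cur) in zip(ordered, ordered[1:]):
        parts.append("=" if prev[axis] == cur[axis] else "<")
        parts.append(name)
    return " ".join(parts)


def two_d_string(objs: Picture) -> Tuple[str, str]:
    """Return the (x-axis, y-axis) symbolic projections of a picture."""
    return axis_string(objs, 0), axis_string(objs, 1)


def _sign(n: int) -> int:
    return (n > 0) - (n < 0)


def matches(query: Picture, image: Picture) -> bool:
    """True if every pair of query objects has the same order in the image."""
    names = list(query)
    if any(name not in image for name in names):
        return False
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            for axis in (0, 1):
                want = _sign(query[p][axis] - query[q][axis])
                have = _sign(image[p][axis] - image[q][axis])
                if want != have:
                    return False
    return True


def sequential_search(query: Picture, database: Dict[str, Picture]) -> List[str]:
    """Scan every image in turn: cost grows linearly with database size."""
    return [img_id for img_id, objs in database.items() if matches(query, objs)]
```

Because `sequential_search` inspects every stored picture, its cost grows with the size of the database, which is the motivation the abstract gives for an index over 2-D strings.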

V. Oria, M.T. Özsu, D. Szafron and P.J. Iglinski. "Defining Views in an Image Database System". In Proc. 8th IFIP 2.6 Working Conference on Database Semantics (DS-8) "Semantic Issues in Multimedia Systems", Rotorua, New Zealand, 5-8 January 1999, pages 231-250.

Abstract: A view mechanism can help handle the complex semantics in emerging application areas such as image databases. This paper presents the view mechanism we defined for the DISIMA image database system. Since DISIMA is being developed on top of an object-oriented database system, we first propose a powerful object-oriented view mechanism based on the separation between types (interface functions) and classes that manage objects of the same type. The image view mechanism uses our object-oriented view mechanism to allow us to give different semantics to the same image. The solution is based on the distinction between physical salient objects which are interesting objects in an image and logical salient objects which are the meanings of these objects.

V. Oria, P. J. Iglinski, M. T. Özsu, "A Framework for Multimedia Database Systems", 4th African Conference on Research in Computer Science, Dakar, Senegal, October 1998, pages 293-304.

Abstract: This paper discusses a general framework for multimedia DBMSs. We use our two ongoing multimedia projects (an SGML/HyTime DBMS and an image DBMS) as examples to support the general framework. Although multimedia data (text, sound, image, video and animation) are different in terms of their processing, similarities exist in the way these data are handled. They all are big with complex structures and semantics. The semantics have to be extracted, stored, and indexed to allow content-based queries.

V. Oria, B. Xu, M. T. Özsu, "Visual MOQL: A Visual Query Language for Image Databases", 4th IFIP 2.6 Working Conference on Visual Database Systems - VDB 4, L'Aquila, Italy, May 1998, pages 186-191 (system prototype demonstration).

Abstract: Since most multimedia database systems are built on top of object or object-relational database systems, they inherit the underlying query facilities. The approach we present in this paper is in two steps. The first step is to design a multimedia query language that will be used as an internal language. The second step is to define an equivalent visual query language and a translator to translate a visual query into a query in the internal query language.

J.Z. Li and M.T. Özsu, "Point-Set Topological Relations Processing in Multimedia Databases", First International Forum on Multimedia and Image Processing, Anchorage, Alaska, May 1998, pages 541-546.

Abstract: Egenhofer and Franzosa's model of fundamental topological relations for spatial regions has received a lot of research attention in geographic information systems and spatial databases. We propose a new way of computing these topological relations with a much smaller storage requirement. We investigate different cases where the new approach can perform even better than the original approach in terms of CPU time. All the experiments are run on top of a commercial object database management system. Some important factors which impact the performance of computing topological relations are also discussed in detail. An image database prototype has been built based on this new approach.

V. Oria, M.T. Özsu, L. Liu, X. Li, J.Z. Li, Y. Niu, and P. Iglinski, "Modeling Images for Content-Based Queries: The DISIMA Approach", Second International Conference on Visual Information Systems, San Diego, CA, December 1997, pages 339-346.

Abstract: The DISIMA project aims at building a complete image database system enabling content-based querying. The model, the architecture and a query language have been defined. The prototype is being implemented on top of the ObjectStore system. DISIMA proposes a model for both image and spatial applications. The DISIMA model allows the user to assign different semantics to an image component (semantic independence) and an image representation can be changed without any effect on applications using it (representation independence). The architecture involves different image sources including WWW-servers and file systems. This paper presents the project overview.

J.Z. Li and M. T. Özsu, "STARS: A SpaTial Attributes Retrieval System for Images and Videos", Proceedings of the 4th International Conference on Multimedia Modeling (MMM'97), Singapore, November 1997, pages 69-84.

Abstract: Combining both text-based retrieval and content-based retrieval techniques in building multimedia databases is the ultimate goal of this work. We describe an object-oriented multimedia database which supports such a combination. In supporting content-based image and video retrieval, we focus on spatial properties, which are an essential part of any image and video retrieval system. We deal with both spatial similarity and spatial relationships. The system is further enhanced by a powerful multimedia query language and by incorporating two-level spatial attributes: precise spatial attributes and their approximation.

Y. Niu, M.T. Özsu, X. Li, "2D-h Trees: An Index Scheme for Content-Based Retrieval of Images in Multimedia Systems", IEEE International Conference On Intelligent Processing Systems 1997 (IEEE ICIPS'97), Beijing, China, October 1997, pages 1710-1715.

Abstract: An important feature to be considered in the design of a Multimedia Database System (MMDBS) is content-based retrieval of images. Spatial features represent the spatial relationships among objects in an image. The salient objects (interesting objects) can be organized in an object hierarchy, based on object-oriented concepts. This paper proposes a new indexing scheme, called 2D-h trees, for content-based retrieval of images. This scheme organizes the representations of the spatial relationships among objects in images and the hierarchical relationships among objects efficiently for query optimization. Our performance analysis indicates that the 2D-h tree is an efficient index scheme for content-based retrieval of images.

J. Z. Li, M.T. Özsu, D. Szafron, V. Oria. "MOQL: A Multimedia Object Query Language", The Third International Workshop on Multimedia Information Systems, Como, Italy, September 1997, pages 19-28.

Abstract: Declarative query languages are an important feature of database management systems and have played an important role in their success. As database management technology enters the multimedia information system domain, the availability of query languages for multimedia applications will be equally important. However, one common problem with currently existing multimedia query languages is their lack of generality. They are designed either for a certain medium (e.g., images) or special applications (e.g., medical, geographical information systems). We describe general multimedia queries based on the ODMG's Object Query Language (OQL). In order to capture the temporal and spatial relationships in multimedia data, OQL is extended by a set of multimedia primitives. The extended OQL also includes functions for query presentation. We illustrate the extended language features by query examples.

J. Z. Li, M.T. Özsu, D. Szafron, "Modeling of Moving Objects in a Video Database", Proceedings of IEEE International Conference on Multimedia Computing and Systems, Ottawa, Canada, June 1997, pages 336-343.

Abstract: Modeling moving objects has become a topic of increasing interest in the area of video databases. Two key aspects of such modeling are object spatial and temporal relationships. In this paper we introduce an innovative way to represent the trajectory of a single moving object and the relative spatio-temporal relations between multiple moving objects. The representation supports a rich set of spatial topological and directional relations. It also supports both quantitative and qualitative user queries about moving objects. Algorithms for matching trajectories and spatio-temporal relations of moving objects are designed to facilitate query processing. These algorithms can handle both exact and similarity matches. We also discuss the integration of our moving object model, based on a video model, in an object-oriented system. Some query examples are provided to further validate the expressiveness of our model.

J.Z. Li, I.A. Goralwalla, M.T. Özsu, D. Szafron, "Modeling Video Temporal Relationships in an Object Database Management System", IS&T/SPIE International Symposium on Electronic Imaging: Multimedia Computing and Networking, San Jose, USA, February 1997, pages 80-91.

Abstract: Video modeling has become a topic of increasing interest in the area of multimedia research. One of the key aspects of videos is the temporal relationship between video frames. In this paper we propose a tree-based model for specifying the temporal semantics of video data. We present a unique way of integrating our video model into an object database management system which has rich multimedia temporal operations. We further show how temporal histories are used to model video data. Using histories to model video data is both simple and natural. It also can lead to a uniform behavioral model. A user can then explore the video objectbase using object-oriented techniques. Such a seamless integration gives a uniform interface to end users. The integrated video objectbase management system supports a broad range of temporal queries and is extensible, thus allowing the easy incorporation of new features into the system.

Keywords: multimedia, temporal, object-oriented, database, video model, query, clips

J.Z. Li, M.T. Özsu, D. Szafron, "Spatial Reasoning Rules in Multimedia Management Systems", Third International Conference on Multimedia Modeling, Toulouse, France, November 1996, pages 119-133.

Abstract: In this paper we consider various spatial relationships that are of general interest for retrieving data from multimedia databases. We present a unified representation of spatial objects for both topological and directional relations. Such a representation is based on Allen's temporal interval algebra. We also present a set of spatial inference rules, which allow us to make heterogeneous spatial relation deductions from existing directional and topological relations. For example, if we know A north of B, B overlap with C, and C north of D, then we derive A above D. Since all the rules are propositional Horn clauses, they can be easily integrated into any multimedia database by using either a simple inference engine or a lookup table.
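The idea of deriving spatial relations from interval relations can be sketched as follows. This is an illustration only, not the paper's rule set: objects are approximated by axis-aligned bounding boxes, `allen` classifies two intervals into one of Allen's thirteen relations, and the `above` and `intersects` predicates are simplified definitions assumed here. The final chain mirrors the shape of the example in the abstract, using `above` in place of the paper's own directional definitions.

```python
from dataclasses import dataclass
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) with start < end


def allen(a: Interval, b: Interval) -> str:
    """Classify interval a against interval b (Allen's 13 relations)."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met_by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started_by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished_by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped_by"


@dataclass(frozen=True)
class Box:
    """Bounding box given by its projections on the x and y axes."""
    x: Interval
    y: Interval


DISJOINT = {"before", "meets", "after", "met_by"}


def above(a: Box, b: Box) -> bool:
    # Assumed definition: a's y-projection lies strictly after b's.
    return allen(a.y, b.y) == "after"


def intersects(a: Box, b: Box) -> bool:
    # Assumed definition: projections share interior on both axes.
    return allen(a.x, b.x) not in DISJOINT and allen(a.y, b.y) not in DISJOINT


# Hypothetical layout: A above B, B intersects C, C above D.
D = Box(x=(0, 2), y=(0, 1))
C = Box(x=(0, 3), y=(2, 4))
B = Box(x=(2, 5), y=(3, 5))
A = Box(x=(4, 6), y=(6, 7))
```

With these boxes, `above(A, B)`, `intersects(B, C)` and `above(C, D)` all hold, and `above(A, D)` holds as well even though the x-projections of `A` and `D` are disjoint. Under the assumed definitions, that is why a chain of this shape yields only the weaker vertical relation.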

Keywords: spatial relation, topology, direction, interval, inference rule

J.Z. Li, M.T. Özsu, D. Szafron, "Modeling of Video Spatial Relationships in an Objectbase Management System", International Workshop on Multimedia DBMS, Blue Mountain Lake, NY, 1996, pages 124-133.

Abstract: A key aspect in video modeling is spatial relationships. In this paper we propose a spatial representation for specifying the spatial semantics of video data. Based on such a representation, a set of spatial relationships for salient objects is defined to support qualitative and quantitative spatial properties. The model captures both topological and directional spatial relationships. We present a novel way of incorporating this model into a video model, and integrating the abstract video model into an object database management system which has rich multimedia temporal operations. The integrated model is further enhanced by a spatial inference engine. The powerful expressiveness of our video model is validated by some query examples.

Keywords: multimedia, spatial, object-oriented, database, video model, query, clips

University of Waterloo

Computer Science

M.T. Özsu