Research

My research is on data management. Although I have done work in basic database technologies such as query processing, transaction processing, and database integration, the main focus of my research follows two threads: (1) application of database technology to non-traditional data types, and (2) distributed and parallel data management. These two threads usually converge.

Investigating how database technology can be applied to data types that are more complex than those found in business data processing (for which relational systems are the perfect fit) has always been one of my interests. Frank Tompa characterizes this type of work as Data Management(X) where X is the data type of interest -- we also have a graduate course with exactly this focus (CS 741). At different times in my career, X has been one or more of the following: {"object" data (in the sense of object databases), multimedia data, temporal data, spatial data, XML, stream data}. Currently, my focus is on graph data and RDF data.

Graph data management

Graphs have always been important data types for database researchers. With the recent growth of social networks, Wikipedia, Linked Data, RDF, and other networks, interest in managing very large graphs has again gained momentum. I have a number of projects in this space.

Publications

  1. A. Pacaci, A. Bonifati and M. T. Özsu, "Regular Path Query Evaluation on Streaming Graphs", Proc. ACM SIGMOD International Conference on Management of Data, pages 1415-1430, 2020.
  2. A. Sahu, A. Mhedhbi, S. Salihoglu, J. Lin, M. T. Özsu, "The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: Extended Survey", VLDB Journal, 29: 595–618, 2020.
  3. A. Pacaci and M. T. Özsu, "Analysis of Streaming Algorithms for Graph Partitioning", Proc. ACM SIGMOD International Conference on Management of Data, pages 1375–1392, 2019.
  4. X. Li and M. T. Özsu. "Correlation Constraint Shortest Path over Large Multi-Correlation Graphs", Proc. VLDB Endowment, 12(5): 488-501, 2019.
  5. K. Ammar and M. T. Özsu, "Experimental Analysis of Distributed Graph Systems," Proc. VLDB Endowment, 11(10): 1151-1164, 2018.
  6. A. Sahu, A. Mhedhbi, S. Salihoglu, J. Lin, M. T. Özsu, "The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing," Proc. VLDB Endowment, 11(4): 420-431, 2018.

Other publications on this topic can be found here.

RDF data management

Resource Description Framework (RDF) has been proposed for modeling Web objects as part of developing the semantic web. It has also gained attention as a way to accomplish web data integration. For example, the Linking Open Data (LOD) cloud is a distributed RDF knowledge base created over hundreds of autonomous datasets. Currently, the LOD cloud contains more than 25 billion triples, and its size is doubling every year. As the volume of RDF data has increased, interesting data management issues have arisen. We study the processing of SPARQL queries over RDF data using a graph-theoretic approach: we represent both the RDF data and the SPARQL queries as graphs and convert the query evaluation problem to one of subgraph matching. My work in this area covers a number of topics.
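
As a rough illustration of the graph-based view described above, the sketch below treats a set of RDF triples as a labeled directed graph and a SPARQL-style basic graph pattern as a small query graph whose variables are marked with a leading "?". It then finds every assignment of variables to data nodes that embeds the query graph in the data graph. This is a naive backtracking matcher written only for illustration; it is not the algorithm used in gStore or in any of the papers listed here, and the names (unify, match, DATA, QUERY) and the toy triples are all hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Binding = Dict[str, str]        # variable name -> matched term


def is_var(term: str) -> bool:
    """Variables are written with a leading '?' (an assumption of this sketch)."""
    return term.startswith("?")


def unify(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Try to match one query edge against one data edge.

    Returns an extended copy of the binding, or None if they conflict.
    """
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def match(patterns: List[Triple], data: List[Triple],
          binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Enumerate all bindings that map every query edge onto a data edge."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in data:
        extended = unify(first, triple, binding)
        if extended is not None:
            yield from match(rest, data, extended)


# Toy data graph (hypothetical triples).
DATA: List[Triple] = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "uw"),
    ("bob", "worksAt", "uw"),
]

# Query graph: who knows someone, and where does that person work?
QUERY: List[Triple] = [("?x", "knows", "?y"), ("?y", "worksAt", "?org")]

results = list(match(QUERY, DATA))
for r in results:
    print(r)
```

The matcher scans every data triple for every query edge, so it is only suitable for toy inputs; avoiding that exhaustive scan at scale is exactly the kind of problem that indexing and query optimization address.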

Publications

  1. P. Peng, Q. Ge, L. Zou, M. T. Özsu, Z. Xu, and D. Zhao. "Optimizing Multi-Query Evaluation in Federated RDF Systems," IEEE Trans. Knowledge and Data Eng., 2019, Forthcoming.
  2. G. Aluç, M. T. Özsu, and K. Daudjee. "Clustering RDF Databases Using Tunable-LSH", VLDB Journal, 28(2): 173-195, 2019.
  3. L. Gao, L. Golab, M. T. Özsu, G. Aluç. "Stream WatDiv: A Streaming RDF Benchmark", Proc. International Workshop on Semantic Big Data, pages 1-6, 2018.
  4. O. Hartig and M. T. Özsu. "Walking without a Map: Ranking-Based Traversal for Querying Linked Data", Proc. 15th International Semantic Web Conference, pages 305–324, 2016.
  5. P. Peng, L. Zou, M. T. Özsu, L. Chen, D. Zhao. "Processing SPARQL Queries Over Distributed RDF Graphs", VLDB Journal, 25(2):243–268, 2016.
  6. G. Aluç, M. T. Özsu, K. Daudjee, and O. Hartig. "Executing queries over schemaless RDF databases", In Proc. 31st Int. Conf. on Data Engineering, pages 807–818, 2015.
  7. L. Zou, M. T. Özsu, L. Chen, X. Sheng, R. Huang, and D. Zhao. "gStore: A Graph-based SPARQL Query Engine," VLDB Journal, 23(4): 565-590, 2014.
  8. G. Aluç, O. Hartig, M. T. Özsu, and K. Daudjee. "Diversified stress testing of RDF data management systems", In Proc. 13th Int. Semantic Web Conference, Part I, pages 197–212, 2014.
  9. G. Aluç, M. T. Özsu, and K. Daudjee. "Workload matters: Why RDF databases need a new design", Proc. VLDB Endowment, 7(10):837–840, 2014.
  10. G. Aluç, M. T. Özsu, K. Daudjee, and O. Hartig. "chameleon-db: a workload-aware robust RDF data management system", Technical Report CS-2013-10, University of Waterloo, 2013.

Other publications on this topic can be found here.


Copyright © M. Tamer Özsu. All rights reserved.
Last update: July, 2020