Please note: This seminar will take place in DC 1304.
Stavros Sintos, Assistant Professor
Department of Computer Science, University of Illinois at Chicago
Exploring and analyzing relational data typically involves two costly steps: data preparation and data processing. During data preparation, tuples from multiple tables are joined to form a comprehensive dataset, while in the subsequent processing step, algorithms are applied to the join results for analysis. This two-step approach is often prohibitively expensive because join outputs can be polynomially larger than the total size of the input tables.
To address this challenge, we develop efficient approximation algorithms for various NP-complete optimization problems on join results, without explicitly materializing the join. Our work focuses on clustering problems and demonstrates how computational geometry enables the fastest known approximation algorithms for relational k-center, k-median, k-means, and related variants. Specifically, for a database instance D of size N and an acyclic join query Q, we introduce novel techniques that combine ideas from computational geometry and database theory to achieve constant-factor approximations in roughly O(kN) time, even when the number of the join results |Q(D)| is orders of magnitude larger than N.
Bio: Stavros Sintos is an Assistant Professor in the Department of Computer Science at the University of Illinois at Chicago (UIC). Before joining UIC, he was a Postdoctoral Scholar on Data Management in the Department of Computer Science at the University of Chicago. He obtained his Ph.D. in the Department of Computer Science at Duke University under the supervision of Prof. Pankaj K. Agarwal. He is a recipient of the James B. Duke Fellowship, and he was nominated for the 2019-2020 outstanding Ph.D. dissertation award for his thesis titled “Efficient Algorithms for Querying Large and Uncertain Data.” Recently, he got the best paper award at ICDT 2024.
His main research interest is in the design of efficient algorithms with theoretical guarantees for problems and queries in databases and data management. An important aspect of his research has been on combining geometric optimization with query processing. His work has been published in top-tier conferences and journals such as PODS, SIGMOD, VLDB, ICDT, and KDD. His research is supported by NSF and Google-CAHSI.
For more details, please visit his website.