Seminar • Data Systems • From Ad-hoc to Systematic Data System Design: Looking Through Index Dynamization

Monday, March 3, 2025 10:30 am - 11:30 am EST (GMT -05:00)

Please note: This seminar will take place in DC 1304.

Dong Xie, Assistant Professor
Computer Science and Engineering Department, Penn State University

Nowadays, numerous specialized data systems are built to accommodate different applications, hardware environments, and real-world constraints. However, it has never been easy to transform or integrate innovative data management ideas into practical solutions. Specifically, ad-hoc design procedures are often required to accommodate changes in feature requirements, system environments, and optimization goals. We found that these procedures often duplicate development effort, since they usually share common methodologies with only subtle tweaks. This hints at the potential of a more systematic design path.

To demonstrate this, I will share a specific research line that leverages index-assisted sampling to achieve interactive data analysis, and reflect on our path to making sampling indexes practical. We started with an ad-hoc design effort that extended sampling indexes with concurrency support, which we found disruptive. We then turned to a systematic dynamization approach that extends static sampling indexes with update and concurrency support while achieving comparable (or sometimes even better) performance. By generalizing this procedure, we define new classes of search problems whose supporting indexes can be extended in a similar manner, and we build a practical index extension framework that performs such extensions automatically.

With this in mind, I envision a new systematic path for practical data system design: we could decouple different design dimensions (e.g., update support, hardware adaptation, and approximation) from data systems and realize them through systematic design modules. I will highlight three major directions toward this goal and conclude with my outlook for future research that aligns with this vision.


Bio: Dong Xie is an assistant professor in the Computer Science and Engineering Department at Penn State University. He received his Ph.D. in Computer Science from the University of Utah in 2020. He received the Google Research Scholar Award in 2023, the Microsoft Research PhD Fellowship in 2018, and the SoCC best paper runner-up award in 2019.

His research interests lie in building data systems that address the challenges of processing and analyzing real-world, large-scale data. His research spans multiple areas, including data systems on modern hardware, distributed databases, main-memory databases, stream processing systems, approximate query processing, spatio-temporal data processing, data privacy, and system security.