Seminar • Data Systems • Hyperscale Data Processing with Network-centric Designs

Tuesday, February 8, 2022 11:30 am - 11:30 am EST (GMT -05:00)

Please note: This seminar will be given online.

Qizhen Zhang
Department of Computer and Information Science, University of Pennsylvania

Today’s largest data processing workloads are hosted in cloud data centers. Due to exponential data growth and the end of Moore’s Law, these workloads have ballooned to hyperscale, encompassing billions to trillions of data items and hundreds to thousands of servers per query. Highly scalable data center networks both enable hyperscale data processing and expand along with it. Hyperscale fundamentally challenges the designs of data processing systems and data center networks. My research rethinks the interactions between these two layers and seeks optimal solutions for supporting data processing in data centers and evolving the cloud infrastructure.

In this talk, I will present network-centric designs, a principled and cross-layer approach to building systems for hyperscale. It concerns data processing in both current and future networks, as well as how networks evolve. To demonstrate the efficiency of this approach, I will first discuss GraphRex, which combines classic database and systems techniques to push the performance of massive graph queries in current data centers. I will then introduce data processing in disaggregated data centers (DDCs), a promising new cloud proposal. In particular, I will detail TELEPORT, a system that allows data processing systems to unlock all DDC benefits, and Redy, a cloud service that realizes DDC features in today’s clouds. Finally, I will show MimicNet, which facilitates network innovation at scale.

Bio: Qizhen Zhang is a Ph.D. candidate in the Department of Computer and Information Science at the University of Pennsylvania, advised by Vincent Liu and Boon Thau Loo. His dissertation research bridges cloud data processing systems and data center networks to address emerging challenges in hyperscale data processing. He is broadly interested in data management, computer systems, and networking, and his research spans the data processing stack. His work appears at database and systems conferences such as SIGMOD, VLDB, and SIGCOMM.