CS 341: Algorithms (Winter 2017)

CS 848/858: Modern Data Processing Systems (Fall 2016)


I work on algorithms and systems that expand the capabilities of large-scale distributed data processing, with a particular emphasis on two problems: (1) performing large-scale joins of record-oriented data; and (2) processing large graphs. For join processing, my research started by formalizing a theoretical model for answering two questions within the context of the MapReduce system: (i) how difficult it is to parallelize different problems; and (ii) how close to optimal existing algorithms are for a given problem. Driven by the insights from studying the difficulty of parallelizing different fuzzy-join and equi-join problems and from analyzing existing algorithms, I worked on new algorithms that have provable guarantees. For graph processing, my research has ranged from building an open-source platform for scalable graph processing, to designing algorithms for distributing graphs across machines, to developing extensions to existing graph processing systems with the goal of making them easier to program.


