Semih Salihoglu
firstname [dot] lastname [at] uwaterloo [dot] ca
DC 3351
Meeting times: Tuesdays, 9am-12pm
Meeting location: DC 2568
Graphs are one of the most natural data structures for representing connected entities in the real world. As such, many real-world applications model their data as a graph and analyze or query these graphs. Examples include neuroscientists using brain networks to identify the functionality and importance of neurons; search engines, such as Google, using knowledge graphs for question answering; banks using transaction graphs to detect fraudulent money flow patterns; and social networking applications, such as Facebook and LinkedIn, using their social networks to make recommendations. This seminar covers topics in graph analytics and data management that have enabled such applications. We will study the software systems for graph analytics and data management and cover algorithmic work focusing on several machine learning tasks, such as community detection, clustering, link prediction, and influence maximization. We will read a wide range of papers from different communities, both within computer science, such as databases, machine learning, systems, and social network analysis, and outside of computer science, such as biology and physics.
Having a background in systems, for example from courses in distributed systems or computer architecture, is not strictly necessary but is helpful. Similarly, a background in network analytics and graph algorithms is not necessary but will help in appreciating the papers on graph analytics.
The main workload will consist of a term-long project related to one of the topics covered in the seminar. The remaining workload consists of paper reviews and two in-class presentations. In a typical seminar, we will have two paper presentations, roughly 25 minutes each, followed by about an hour of open discussion. Each week a different student will lead the discussion. This is a seminar, so your participation in class is critical to everyone's learning. That is why it will be a significant part of your grade. Please ask questions and make comments throughout the discussions. The workload pieces and mark breakdown are as follows:
For each class (except the first seminar), you will write reviews of two of the papers assigned to that day. If more than two papers are assigned, you can pick any two of them. You are allowed to skip 3 reviews throughout the term. The reviews should be written in the style of conference peer reviews: critiques that describe the strengths and weaknesses of the paper and contain suggestions to make the paper stronger. Each review must be written in the following template:
Each student will give 2 presentations during the term. Each presentation will be about 25 minutes long. Here are the important points summarizing what you have to do for your presentations.
Date | Topic | Readings | Slides |
---|---|---|---|
W1: Jan 7 | Overview, Historic Graph Data Management Systems | | |
W2: Jan 14 | Modern Graph Database Management Systems | | |
W3: Jan 21 | Parallel Graph Analytics Systems 1: Vertex-centric Systems | | |
W4: Jan 28 | Parallel Graph Analytics Systems 2: Other Frontends | | |
W5: Feb 4 | Semantic Web and Knowledge Graph Management | | |
W6: Feb 11 | Linked Data and Knowledge Graphs | Optional | |
W7: Feb 25 | Structure and Properties of Real World Graphs | | |
W8: March 3 | Analytics Algorithms | | |
W9: March 10 | Graphs in Machine Learning | | |
W10: March 17 | Regular Path Queries | | |
W11: March 24 | Selected by students from the above list | | |
W12: March 31 | Streaming/Dynamic Graphs |